Chapter 1

An informal introduction to the derivative

1.1 Review: functions and the slope of a linear function


Calculus is the study of rates of change, and of how change accumu- 
lates. For example, figure a shows the changes in the United States 
stock market over a period of 24 years. The y axis of this graph 
is a certain weighted average of the prices of stock, and the x axis 
is time, measured in years. This is an example of the concept of 
a mathematical function, which you’ve learned about in a previous 
course. We say that the stock index is a function of time, meaning 
that it depends on time. What makes this graph the graph of a 
function is that a vertical line only intersects it in one place. This 
means that at any given time, there is only one value of the index, 
not more than one. 


Figure a shows a function that was determined by measurement 
and observation, but functions can also be defined by a formula. For 
example, we could define a function y by stating that for any number
x, the value of the function is given by y(x) = x². We sometimes
state this kind of thing more casually by referring to “the function
y = x²” or “the function x².”


I drew figure a by graphing yearly data, so it’s made of line 
segments that connect one year to the next. Each of these line 
segments has a slope, defined as 

$$\text{slope} = \frac{y_2 - y_1}{x_2 - x_1} \qquad (1)$$

The slope measures how fast the function is changing. A positive 
slope says the function is increasing, negative decreasing. If the 
slope is zero, the function is not changing at all. 
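
As a minimal sketch of equation (1), the snippet below computes the slope of each line segment joining consecutive data points. The data values are made up for illustration and are not the actual index values plotted in figure a; the function name `segment_slopes` is also just a label chosen for this sketch.

```python
def segment_slopes(xs, ys):
    """Return the slope (y2 - y1)/(x2 - x1) of each segment joining consecutive points."""
    slopes = []
    for i in range(len(xs) - 1):
        rise = ys[i + 1] - ys[i]   # change in y
        run = xs[i + 1] - xs[i]    # change in x
        slopes.append(rise / run)
    return slopes

# Hypothetical yearly data (not the real S&P 500 values).
years = [1990, 1991, 1992, 1993]
index = [1.0, 1.25, 1.25, 1.0]

print(segment_slopes(years, index))
```

With this invented data the three slopes come out positive, zero, and negative, matching the three cases described above: increasing, not changing, and decreasing.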


It’s often convenient to express this kind of thing with the notation
Δ, the capital Greek letter delta, which is the equivalent of
our Latin “D” and here stands for “difference.” In terms of this
notation, we have

$$\text{slope} = \frac{\Delta y}{\Delta x}. \qquad (2)$$

A symbol like Δy indicates the change in y, Δy = y₂ − y₁. It doesn’t
mean a number Δ multiplied by a number y.


a / The S&P 500 stock index is a function of time. (Vertical axis: S&P 500 index, relative to 1990.)

b / Given two points on a line, we can find its slope by computing
Δy/Δx, the rise over the run.
1.2 The derivative


1.2.1 An informal definition of the derivative 


In many real-world applications, it makes sense to think of change 
as occurring smoothly and continuously. For example, the level of 
water in a reservoir rises and falls with time. Although it’s true that 
this change happens one molecule at a time, so that in theory there 
are abrupt jumps, these jumps are too tiny to matter in practice. 


c/The original graph, on the left, shows the water level in Trinity Lake, California, for the thirty-day 
period beginning March 7, 2014. Each successive magnification to the right is by a factor of four. 


d/ The tangent line at a point on 
a curved graph. 


We want to keep track of the net rate of flow into the reservoir. 
We would like to define this rate as the slope of the graph, but the 
graph isn’t a line, so how do we do that? We could pick two points 
on the graph and connect them with a line segment, but that would 
only represent an average rate of flow, not the actual rate of flow as 
it would be measured by a flow gauge at one particular time. 


To get around these difficulties, we imagine picking a point of 
interest on the graph and then zooming in on it more and more, 
as if through a microscope capable of unlimited magnification. As 
we zoom in, the curviness of the graph becomes less and less ap- 
parent. (Similarly, we don’t notice in everyday life that the earth 
is a sphere.) In figure c, we zoom in by 400%, and then again by 
400%, and so on. After a series of these zooms, the graph appears 
indistinguishable from a line, and we can measure its slope just as 
we would for a line. This is an intuitive description of what we 
mean by the slope of a function that isn’t a line. We call this slope 
the derivative of the function at the point of interest. This is ad- 
mittedly not a mathematically rigorous definition, but it fixes our 
minds on the concept we want. A useful example is that if we con- 
sider a car’s odometer reading as a function of time in hours, then 
its speedometer reading is the derivative of the odometer reading. 
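
The zooming-in idea can be imitated numerically. The sketch below is only an illustration: the curved function and the point x = 1 are arbitrary choices, and the names `secant_slope` and `zoom_slopes` are hypothetical. It measures the slope of the straight line joining two nearby points on the graph, then repeats the measurement with the window shrunk by a factor of four each time, as in figure c.

```python
def secant_slope(f, x, width):
    """Slope of the straight line joining the points on the graph at x and x + width."""
    return (f(x + width) - f(x)) / width

def zoom_slopes(f, x, width=1.0, zooms=8, factor=4.0):
    """Measure the slope repeatedly, shrinking the window by `factor` on each zoom."""
    slopes = []
    for _ in range(zooms):
        slopes.append(secant_slope(f, x, width))
        width /= factor
    return slopes

def curve(x):
    # An arbitrary curved function, chosen only for illustration.
    return x ** 3

for slope in zoom_slopes(curve, 1.0):
    print(slope)
```

The printed slopes change less and less with each zoom and settle toward a single value, which is the number the informal definition calls the derivative at that point.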


If we were only shown the ultra-magnified view in the rightmost 
part of figure c, we wouldn’t know that the graph was curved at all. 
We would think the whole thing was a line. This hypothetical line 
is called the tangent line at the point marked with a dot. When 
you stand on the earth’s surface and look at a point on the horizon, 
your line of sight is a tangent line to the surface. The derivative of
a function is the slope of the tangent line. 


1.2.2 Locality of the derivative 


From this informal definition it seems that the derivative of a 
function at a certain point should depend only on the behavior of the 
function near that point, not far away. To state this idea precisely, 
we need to use some notation referring to sets, reviewed in box 1.1, 
and intervals. 


Often it is useful to define a set of all the real numbers that lie 
within a certain range, between numbers a and b. This is called an 
interval. We can define intervals that contain or don’t contain their 
endpoints. 


Definition
type of interval    definition               abbreviation
closed              {x | x ≥ a and x ≤ b}    [a,b]
open                {x | x > a and x < b}    (a,b)


We can also have intervals like [a,b) and (a,b], which are de- 
fined in the obvious way. A similar notation for infinite intervals is 
introduced in problem i4, p. 41. 


Locality of the derivative 

The derivative is local, in the following sense. Suppose there
is an interval I = (a,b) on which the functions f and g are
equal. That is, for any x ∈ I, f(x) = g(x). Then at any point
in I, the derivatives of f and g are the same.





e / Fred and Ginger are both driving on the freeway. As Ginger is
about to pass Fred, she notices a motorcycle cop, so she abruptly
decelerates and then stays alongside Fred. The derivative of their
position is their speed. The derivative is local, so by the time the cop
measures their speeds, at point P, they are the same.


>Box 1.1 Sets 


A set is a collection of things. The things can, for example, be
numbers. They can even be other sets. A set can be defined by listing
the things it holds, which are called its elements or members. For
example, the solutions of the equation x² = 1 are the members of the
set {−1, 1}. Often we deal with infinite sets such as the set of all
the natural numbers, and it is then impossible to list all the
elements. Instead, we can define a set using notation like this:

$$S = \{x \mid x^2 > 0\},$$

read as, “the set of all x such that x squared is greater than zero.”
Often, as in this example, we don’t explicitly say what to consider
as the possible values of x; since the focus of calculus is on real
numbers, the implication in this course is usually that “the set of
all x such that ...” means “the set of all real numbers x such that
....”

The notation ∈ means “is a member of,” e.g., 1 ∈ S for the set S
defined above.


Two sets are the same if they have the same members. For example, let

$$T = \{x \mid x \ne 0\} \quad\text{and}\quad U = \{x \mid |x| > 0\}.$$

Because S, T, and U have the same members, they are equal, S = T = U.


f / Some properties of the derivative. (Panels: shift, same slope; horizontal flip, negated slope; vertical flip, negated slope; horizontal stretch, decreased slope; vertical stretch, increased slope.)


>Box 1.2 Ideas about 
proof: stating your as- 
sumptions 


The properties listed here 
can be used to solve problems, 
as in section 1.2.4, where we'll 
calculate the derivative of the 
function y = x². But math
isn’t just calculation. We also 
want to prove general facts. A 
proof always requires certain 
starting assumptions, e.g., you 
can’t prove to a friend that 
cap-and-trade is the best way 
to deal with global warming if 
your friend won’t admit that 
global warming exists. This list 
of properties includes enough 
assumptions to prove quite a 
few general facts about deriva- 
tives. 
1.2.3 Properties of the derivative


The following properties of the derivative are intuitively reason- 
able based on our conceptual definition, and they will be enough 
to allow us to do quite a bit of interesting calculus before we come 
back and make a more general definition. 


constant The derivative of a constant function is zero. 
line The derivative of a linear function is its slope. 


shift Shifting a function y(x) horizontally or vertically to form a
new function y(x + a) or y(x) + b gives a derivative at any
newly shifted point that is the same as the derivative at the
corresponding point on the unshifted graph.


flip Flipping the function y(x) horizontally or vertically to form a 
new function y(—x) or —y(x) negates its derivative at corre- 
sponding points. 


addition The derivative of the sum of two functions is the sum of 
their derivatives. 


stretch Stretching a function y(x) vertically to form a new function
ry(x) multiplies its derivative by r at the corresponding
points, while stretching it horizontally to make y(x/s) divides
its derivative by s.


no-cut Suppose that for a certain point P on the graph of a function,
there is a unique linear function ℓ that passes through
P but doesn’t cut through the graph at P. Then the graph of
ℓ is the tangent line, and the derivative of the function at P
equals the slope of the line.
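
The shift, flip, and stretch properties listed above can be spot-checked numerically. The following is a rough sketch, not a proof: the sample function, the constants a, b, r, s, and the helper `approx_derivative` are all arbitrary choices made for this illustration. The helper estimates a slope from two points very close together, which stands in for zooming in on the graph.

```python
def approx_derivative(f, x, h=1e-6):
    """Estimate the slope of f at x from two nearby points (a stand-in for zooming in)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def y(x):
    # Arbitrary sample function, chosen only for illustration.
    return x ** 3 - 2 * x

a, b, r, s = 0.5, 4.0, 3.0, 2.0   # arbitrary shift and stretch amounts
x0 = 1.2                          # arbitrary point at which to compare slopes

# Each entry pairs the slope of the transformed function with the slope
# that the corresponding property predicts from the original function.
checks = {
    "horizontal shift": (approx_derivative(lambda x: y(x + a), x0), approx_derivative(y, x0 + a)),
    "vertical shift": (approx_derivative(lambda x: y(x) + b, x0), approx_derivative(y, x0)),
    "horizontal flip": (approx_derivative(lambda x: y(-x), x0), -approx_derivative(y, -x0)),
    "vertical flip": (approx_derivative(lambda x: -y(x), x0), -approx_derivative(y, x0)),
    "vertical stretch": (approx_derivative(lambda x: r * y(x), x0), r * approx_derivative(y, x0)),
    "horizontal stretch": (approx_derivative(lambda x: y(x / s), x0), approx_derivative(y, x0 / s) / s),
}

for name, (new_slope, predicted) in checks.items():
    print(f"{name}: {new_slope:.6f} vs {predicted:.6f}")
```

In each row the two numbers should agree to within the small error of the numerical estimate. Agreement at one point for one function is only an example, of course, and does not establish the properties in general.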


As an example of the stretch rule, cars sold in the U.S. have 
odometers that read out in units of miles, while those sold elsewhere 
are calibrated in kilometers, so their readings are greater by the 
conversion factor r = 1.6. By the stretch property, cars outside the 
U.S. also have speedometer readings that are greater by this factor: 
they read out in kilometers per hour. 


There is usually, but not always, a line like the one described by
the no-cut property. Sometimes there is a tangent line but it isn’t
a no-cut line. If this kind of mathematical puzzle interests you, try
sketching the graphs of the functions x³ and √x. You should be
able to convince yourself that their tangent lines at x = 0 can’t be
described by no-cut functions.


By the way, these are just names I’ve given to these properties, 
and if you use them with other people, they won’t know what you 
mean. Once we’ve done more calculus, we’ll see that several of these 
properties are actually special cases of a more general rule called the 
chain rule. 
1.2.4 The derivative of the function y = x²


As our first example of a derivative, let’s use the function y = x².


Its graph is a parabola. The simplest point at which to find its 
derivative is x = 0, the central point of the graph. From figure 
g, it seems like zooming in more and more on this point would give 
something that looked more and more like a horizontal line, and this 
suggests that the derivative at this point is zero. We can confirm this 
by using the flip property. Flipping the graph horizontally across 
the y axis doesn’t change the graph. (Recall that a function with 
this symmetry is called an even function.) Since the flip doesn’t 
change the function, it can’t change the derivative of the function. 
But the flip rule says that when we flip a function, the derivative 
is negated at the corresponding point on the new graph. Here the 
point of interest is x = 0, and that point doesn’t move when we flip 
it, so its corresponding point on the new graph is the same point. 
Thus the derivative at x = 0 must be the same as itself, but also 
equal to minus itself. Zero is the only number that remains the same 
when we reverse its sign, so the derivative at the center of the graph 
is zero. 


How about the derivative at the point x = 1? Here we can apply
the no-cut rule. By laying a ruler against this point, we find that the
linear function ℓ(x) = 2x − 1 seems to intersect the parabola without
cutting across it. To prove that this is true, we can compute the
difference between the two functions, y(x) − ℓ(x) = x² − 2x + 1.
Completing the square allows us to rewrite this as (x − 1)², which
is clearly positive for any value of x other than 1. Therefore the
function ℓ meets the conditions of the no-cut rule, and the derivative
of x² at x = 1 is 2.


Having found the derivative of x² at x = 1, we can now use the
stretch rule to find it at any other point. For example, suppose we
want to know the derivative at x = 3. If we were to take the graph
of the function x² and stretch it by a factor of 3 horizontally and
9 vertically, we would get the same graph again. These stretches
take the point (1,1), where we know the derivative, to the point
(3,9), where we want to know it. The stretch rule tells us that
the horizontal stretch decreases the derivative to 1/3 of its original
value, but the vertical stretch increases it by 9 times, so that
overall, the derivative at (3,9) is (1/3)(9) = 3 times its
value at (1,1). Thus the derivative at x = 3 equals 6.


There is nothing special about the number 3. The method that
we applied to x = 3 would work for any other number x, not just
for 3. We find that the derivative of the function x² at any point
x equals 2x. Taking stock of what we’ve done, we started with the
function x², and found that at any point x, the derivative was 2x.
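
To spell out the claim that nothing is special about 3, here is the same stretch argument written with a general positive stretch factor in place of 3. The symbol s is introduced only for this sketch. Stretching the graph of y = x² horizontally by s and vertically by s² maps it onto itself and carries the point (1, 1) to the point (s, s²), so by the stretch property

```latex
y'(s)
  = \underbrace{\frac{1}{s}}_{\text{horizontal stretch}}
    \cdot \underbrace{s^{2}}_{\text{vertical stretch}}
    \cdot \, y'(1)
  = s \cdot 2
  = 2s .
```

Setting s = 3 recovers the value 6. Negative values of x are not covered by this stretch, but they follow from the flip property, since the graph of x² is unchanged by a horizontal flip.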

















g / The function y = x².

h / The line 2x − 1 intersects
the function x² without cutting it.





> The derivative of a function is a function itself. 


We’ve found that the derivative of the function x² at a point x
equals 2x. The expression 2x can be thought of as a function of x.
So what we’ve really done is to take a function and construct a new
function that gives the derivative of the original function at each
point. One way of notating this new function is y′, read “y prime.”
We have

$$y'(x) = 2x.$$

The craft of finding this kind of derivative-function from the original
function is called differentiation. We have differentiated the function
x² and gotten its derivative, the function 2x.


Hiking Example 1
Figure j shows a graph of my favorite route for climbing a mountain
near where I live. (My wife rolls her eyes when I tell her the
dog and I are doing this hike yet again.) How steep is the hike?
There is no generic answer to this question, since the derivative
of this function is itself a function. The derivative depends on
x, so it has different values in different places. The slope of the
graph at point P appears to be the steepest, with y′ ≈ 0.80. At
other points, y′ has smaller values. At Q, it’s slightly negative.
The derivative y′ is a function of x; it depends on which part of
the hike you’re presently climbing.


An indifference curve Example 2 
Let’s say you enjoy beer, and you also enjoy sushi. How much 
would you prefer to have of each? Economists define a graph, 
figure k, called an indifference curve. For a particular person, any 
two points on the curve are supposed to be equal in preference; 
the person is indifferent as to which one they get. For example, 
the person whose indifference curve is drawn in figure k is equally 
happy having one piece of sushi and five beers, or having three 
pieces of sushi and two beers. 


There is a quantity called the marginal rate of substitution (MRS),
which is defined as minus the slope of the indifference curve, −y′.
At point P in figure k, the MRS is high, which means that the person
would have been just as happy to have another piece of sushi
and a lot less beer. The MRS, −y′, is a function of where you are
on the curve. If the person is at point Q on the graph, they have
a moderate amount of beer and a moderate amount of sushi,
so they consider them of more comparable value. Indifference
curves are discussed further in section 3.4.3, p. 89.




What if x is in the exponent rather than the base? Example 3
The method used above to differentiate x² was basically a trick,
and it depended on a special property of the function x², which is
that its graph can be stretched horizontally and vertically in such
a way that it can be brought back on top of itself again. The
reason that this subject is called “calculus” rather than “trickery”
is that we will soon (in ch. 2) develop more systematic methods
for calculating rates of change, methods that don’t depend on
tricks.


It may nevertheless be of interest to note that a similar trick is
capable of telling us something about a different type of function,
one in which x appears in the exponent rather than the
base. What about the function 2^x, for example? A pair of rabbits
marches off of Noah’s ark. Two bunnies become four, then 8, 16,
32, and so on. What is the derivative of this function, i.e., the rate
of change of the rabbit population per generation? (Strictly speaking,
the derivative is only meaningful if we fill in all the non-integer
values of x, which isn’t really meaningful in terms of rabbits, since
you can’t have a fraction of a rabbit.)


It happens that the function 2^x, like x², can be brought back on
top of itself again in a simple geometrical way. Instead of a horizontal
stretch and a vertical stretch, we use a horizontal shift and
a vertical stretch. For example, if we shift the graph of 2^x to the
right by 3 units, and then stretch it vertically by a factor of 8, we get
back the same graph again. This has come about because of the
more fundamental property of exponential functions b^(x+c) = b^x b^c.
(In our example, the base b is 2.) As a result, we find that after 3
generations, when the rabbit population goes up by a factor of 8,
its derivative also goes up by a factor of 8. That is, the derivative
of an exponential function y = b^x is proportional to y, or

$$y' = (\ldots)\,y,$$

where “...” is a constant of proportionality that depends on the
base b. What is the constant of proportionality? We’ll return to
this question in example 6 on p. 51.


A similar example is credit card debt. The more credit card debt 
you have, the faster your debt grows; in this example, the constant 
of proportionality relates to the interest rate. 
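
The proportionality claimed in this example can be spot-checked numerically. The sketch below is an illustration under stated assumptions: the helper `approx_derivative` estimates a slope from two points very close together, and the function name `rabbits` is just a label for 2^x. If the derivative really is proportional to the function itself, then the ratio of the two should come out the same at every x.

```python
def approx_derivative(f, x, h=1e-6):
    """Estimate the slope of f at x from two nearby points."""
    return (f(x + h) - f(x - h)) / (2 * h)

def rabbits(x):
    # The function 2 to the power x from the example.
    return 2.0 ** x

generations = range(0, 7)
ratios = [approx_derivative(rabbits, x) / rabbits(x) for x in generations]

for x, ratio in zip(generations, ratios):
    print(x, ratio)
```

The printed ratios agree with one another to many decimal places, which is what a fixed constant of proportionality would look like. The sketch does not explain where the value of that constant comes from.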


Discussion question 


A What is wrong with the logic of the following argument? You should 
believe in God, because if you don’t, when you die you'll go to Hell. 


Refer to box 1.2 on p. 16. 
>Box 1.3 Ideas about
proof: examples don’t
prove a rule


An example can’t prove a general rule. French is the official
language of Côte d’Ivoire, but that doesn’t prove that it’s the
official language of all of Africa. In fact there are other
countries in Africa, such as Egypt, that speak different languages,
such as Arabic. In general, an example can never prove a general
rule, but a counterexample (Egyptians speaking Arabic) can disprove
a rule (all of Africa speaking French).














l / Example 4. The top graph
shows the original function, the 
bottom its derivative. 
1.3 Derivatives of powers and polynomials


In section 1.2.4, we found that the derivative of x² was 2x.
Straightforward application of the same technique to x³ gives 3x². We see
a pattern:


Derivatives of powers
The derivative of x^n equals n x^(n−1), if n is any integer greater than
or equal to 1.


Observing the pattern or giving examples is not enough to prove 
this general rule (box 1.3). To prove this for all these values of n, 
rather than carrying out the proof for one value at a time, it will 
be more convenient to use techniques developed later in the book 
(section 2.6, p. 57). 


If we combine this with the addition and stretch rules, we know 
enough to differentiate any polynomial. 


Differentiating a polynomial Example 4
> Find the derivative of y = x³ − 7x + 1.

> The addition property of the derivative tells us that we can break
this problem down into three parts,

$$(x^3 - 7x + 1)' = (x^3)' + (-7x)' + (1)',$$

where the primes indicate “derivative of ...” The stretch property
says that (−7x)′ is the same as (−7)(x)′, so the derivative of our
polynomial becomes

$$(x^3)' + (-7)(x)' + (1)'.$$

We know how to differentiate powers: (x³)′ = 3x², (x)′ = 1, and
(1)′ = 0. (We could have found the second term from the line
property, and the final one from the constant property.) The result
is

$$y' = 3x^2 - 7.$$

The functions y and y′ are graphed in figure l, and five points are
marked as examples of how the slope of y corresponds to the 
value of y’. Reading across from left to right on the top graph, 
the slopes are positive, zero, negative, zero, and positive. On the 
bottom graph, the values of y’ are easily seen to be positive, zero, 
negative, zero, and positive. 
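
The term-by-term procedure of example 4 is mechanical enough to write as a short program. This is a minimal sketch, assuming a polynomial is stored as a list of coefficients in order of increasing power; that representation and the function names `differentiate` and `evaluate` are choices made for this illustration.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...], meaning c0 + c1*x + c2*x**2 + ...

    Each term c_n * x**n becomes n * c_n * x**(n - 1); the constant term drops out.
    """
    return [n * c for n, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** n for n, c in enumerate(coeffs))

y = [1, -7, 0, 1]            # y = 1 - 7x + 0x^2 + x^3
y_prime = differentiate(y)   # coefficients of y'
print(y_prime)
```

The result is the coefficient list for −7 + 0x + 3x², in agreement with the derivative found by hand. The addition and stretch properties are what justify handling each term separately and carrying each coefficient along as a constant factor.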




1.4 Two trivial hangups 
1.4.1 Changing letters of the alphabet 


The following point is relatively trivial, but nevertheless hangs 
up many students in applying calculus to real life. In a calculus text- 
book, we typically use the letters x and y, with y being a function 
of x. That is, x is the independent variable, and y is the dependent 
one. In real-life applications, however, the variables have definite 
meanings, and we want to use letters that make it easy to remem- 
ber what they stand for. 


For example, suppose that a social media company has a certain 
number of users, and they need to have enough computing power 
at their data center to be able to handle all of those users. This 
computing power will cost them a certain amount of money per 
month. In this example, it would be natural to use the notation u 
for the number of users, and c for the monthly cost in dollars. Then 
c depends on u, and we have a function c(u). Let’s say the function 
is this: 

$$c = u^2$$

This is not an unrealistic equation to imagine for this example, since 
the company has to keep track of every user’s relationship to every 
other user. For example, user Andy may be able to mark himself 
as a “fan” or “follower” of user Betty, and then the company has to 
store a piece of information in a database to record this relationship. 
If there are a thousand users, there are 1000 × 1000 or a million such
possible relationships that may need to be stored in a database. 


Now if the company’s user base is growing, it’s of interest to 
them to know how much their costs will go up for each additional 
user (the marginal cost). This would be expressed by the derivative 
c′(u). Although the letters of the alphabet are different than the
ones we used in our earlier examples, that makes no difference in
how we do the math. If differentiating y = x² with respect to x
gives y′ = 2x, then differentiating c = u² with respect to u gives the
same result but with the letters changed,

$$c' = 2u.$$
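
As a quick check on the interpretation of the derivative as a marginal cost, the sketch below compares 2u with the exact cost of adding one more user under the model c = u². The user counts are arbitrary, and the function names are labels chosen for this illustration.

```python
def cost(u):
    """Monthly cost in dollars for u users, using the model c = u**2 from the text."""
    return u ** 2

def marginal_cost(u):
    """The derivative of the cost with respect to the number of users, 2u."""
    return 2 * u

for users in (10, 1000, 1_000_000):
    actual = cost(users + 1) - cost(users)   # exact cost of adding one more user
    estimate = marginal_cost(users)
    print(users, actual, estimate)
```

The two figures differ by exactly one dollar, because (u + 1)² − u² = 2u + 1. For a large user base that difference is negligible, so the derivative is a good description of the cost of each additional user.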


1.4.2 Symbolic constants 


The vertical stretch property of the derivative tells us that if we
know a derivative such as

$$(x^2)' = 2x,$$

then we can differentiate a function like 5x² by simply letting the
factor of 5 “come along for the ride,”

$$(5x^2)' = (5)(x^2)' = (5)(2x) = 10x.$$
Now suppose that we want to differentiate bx², where b is a constant,
i.e., b doesn’t depend on x. To many students this looks like a much
more difficult and abstract problem, but the procedure is the same:

$$(bx^2)' = (b)(x^2)' = 2bx.$$

The same goes for a vertical shift. If we aren’t intimidated by computing

$$(x^2 + 5)' = (x^2)' = 2x,$$

then there is no reason to be scared of the similar computation
(again with b being a constant) of

$$(x^2 + b)' = (x^2)' = 2x.$$


1.5 Applications 


1.5.1 Velocity 
Defining velocity 


One of our prototypical examples has been the odometer and 
speedometer on a car’s dashboard. In fact, if we want to define 
what velocity means, we have to define it as a derivative. Suppose 
an object (it could be a car, a galaxy, or a subatomic particle) is 
moving in a straight line. By choosing a unit of distance and a 
location that we define as zero, we can superimpose a number line 
onto this line. (In the example of the car, the unit of distance might 
be kilometers, and the zero position would be the point on the road 
at which we last pushed the button to zero the odometer.) Let the 
position defined in this way be x. Then x is a function of time t 
(such as the time measured on a clock), and we notate this function 
as x(t). Note that although we typically use the letters x and y in a
generic mathematical context, with y being a function of x, in our 
present example it is more natural to use different letters, and now 
x is the dependent variable, not the independent one. That is, x is 
a function of t, but t may not be a function of x; for example, if a 
car stops and backs up, then it can visit the same position twice, 
so that a graph of t versus x would fail the vertical line test for a 
function. In this notation, the velocity v is defined as the derivative 


$$v(t) = x'(t).$$


Constant acceleration 


An important special case is the one in which the position function
is of the form

$$x(t) = \frac{1}{2} a t^2,$$

where a is a constant, and the factor of 1/2 is conventional, and
convenient for reasons that will become more apparent in a moment.
Differentiating with respect to t, we have the velocity function

$$v(t) = at,$$

where the symbolic constant a has been treated like any other constant,
and the 1/2 in front has been canceled by the factor of 2 that
comes down from the exponent. We see that the velocity is proportional
to the amount of time that has passed. If t is measured
in seconds and v in meters per second (m/s), then the constant a,
called the acceleration, tells us how much speed the object gains with
every second that goes by, in units of m/s/s, which can be written
as m/s². Falling objects have an acceleration of about 9.8 m/s².
This is a measure of the strength of the earth’s gravity near its own
surface.


Dropping a rock down a well Example 5 
> Looking down into a dark well, you can’t see how deep it is. If 
you drop a rock in and hear it hit the bottom in 2 seconds, how 
deep is the well? 


>
$$x(t) = \frac{1}{2} a t^2 \approx 20\ \text{m}$$
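
The arithmetic behind this estimate can be written out as follows. This is only a sketch of the same calculation, using the value of about 9.8 m/s² quoted earlier for falling objects; like the estimate itself, it takes the rock to start from rest and ignores air resistance and the time the sound takes to travel back up the well. The function name is hypothetical.

```python
def distance_fallen(t, a=9.8):
    """Distance in meters covered after t seconds, starting from rest,
    with constant acceleration a (in m/s^2)."""
    return 0.5 * a * t ** 2

depth = distance_fallen(2.0)
print(f"{depth:.1f} m")
```

The unrounded figure is a little under 20 m, which is why the answer is stated only approximately.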


The shift property applied to constant acceleration Example 6
The equations for constant acceleration were given above with
the unstated assumption that both the position and the velocity
would be zero at the time t = 0. If we relax this assumption, then
the position function can be of the more general form

$$x(t) = x_0 + \frac{1}{2} a (t - t_0)^2,$$

where t₀ is some initial time, at which the position equals x₀. By
the shift property of the derivative (p. 16), the velocity function is
then

$$v(t) = a(t - t_0).$$


1.5.2 When do you need a derivative? 


Finding velocity from position data is a classic application of 
calculus, and yet how do we know when we really need calculus for 
this application? After all, many people do simple computations 
involving velocity without knowing calculus. 


m / Revenue from a tram as a function of the fare charged. (Horizontal axis: fare f, in Swiss francs.)

Here’s an example where calculus really is required. In July
1999, Popular Mechanics carried out tests to find which car sold by
a major auto maker could cover a quarter mile (402 meters) in the
shortest time, starting from rest. Because the distance is so short,
this type of test is designed mainly to favor the car with the greatest
acceleration, not the greatest maximum speed (which is irrelevant
to the average person). The winner was the Dodge Viper, with a
time of 12.08 s. If we divide the distance by the time, we get


$$\frac{402\ \text{m}}{12.08\ \text{s}} = 33.3\ \text{m/s},$$

which is about 74 miles per hour or 120 kilometers per hour. Not
a very impressive speed, is it? That’s because it’s wrong. During
those twelve seconds of acceleration, the car didn’t have just one
speed. It started at a velocity of zero and went up from there. The
top speed was nearly double the one calculated above (53 m/s ≈
119 mi/hr ≈ 191 km/hr). The important point here is that when
we measure a rate of change using an expression of the form 


$$\frac{\Delta\ldots}{\Delta\ldots}$$


we only get the right answer if the rate of change is constant. In 
this example the rate of change is the velocity, and the velocity is 
not constant. To find the correct velocity, we first need to decide 
at which time we want to know the velocity, and then evaluate the 
derivative at that time. 
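
To see the difference between an average rate and a derivative in numbers, the sketch below compares the two for the quarter-mile run. It rests on a deliberately crude assumption that is not made in the text: that the car's acceleration is constant, with a value chosen so that it covers 402 m in 12.08 s. A real car does not accelerate uniformly, so the second figure is an idealization and not a prediction of the measured top speed.

```python
def position(t, a):
    """Position under constant acceleration a, starting from rest: (1/2) a t^2."""
    return 0.5 * a * t ** 2

def velocity(t, a):
    """Derivative of the position function above: a t."""
    return a * t

distance = 402.0   # meters (a quarter mile)
elapsed = 12.08    # seconds

average = distance / elapsed

# Assumption for illustration only: constant acceleration, chosen so that
# the car covers the full distance in the measured time.
a = 2 * distance / elapsed ** 2
final = velocity(elapsed, a)

print(f"average velocity: {average:.1f} m/s")
print(f"velocity at the finish (constant-acceleration model): {final:.1f} m/s")
```

In this idealized model the velocity at the finish is exactly twice the average, so dividing distance by time clearly does not give the velocity at any particular moment of interest. The measured top speed of 53 m/s quoted above is lower than the model's figure, which is consistent with the real acceleration not being constant.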


1.5.3 Optimization 


An extremely important use of the derivative is in optimiza- 
tion. For example, suppose that the operators of a privately owned 
mountain tram in Switzerland want to optimize their profit from 
transporting sightseers to a mountain summit in the Alps. The cost 
of building the tram is a sunk cost, and operating it for one day 
costs the same amount of money regardless of the number of pas- 
sengers. Therefore the only goal is to get the maximum number 
of Swiss francs in the cash registers at the end of each day. The 
operators can raise the fare f in order to make more money, but if 
the fare is too high then not as many people will be willing to pay 
it. Suppose that the number of riders in a given day is given by 
a − bf, where a and b are constants. That is, if the ride was free, a
passengers would ride each day, but for every one-franc increase in 
the fare, b people will decide not to go. The tram’s daily revenue is 
then found by multiplying the number of riders by the fare, which 
gives the function 


$$r(f) = (a - bf)f. \qquad (3)$$


For insight into what’s going on, figure m shows this function in 
the case where a = 100 and b = 1. When the fare is zero, we get 
plenty of customers every day, but they don’t pay anything, so our 
revenue is zero. When the fare is 100 francs, the number of paying 
passengers goes down to zero, so again we have no revenue. 


Somewhere in between these extremes we have the fare that 
would optimize our revenue: the maximum of the function r. At
this point on the graph, the derivative is zero, so to find it, we should 
differentiate r, set it equal to zero, and solve for f. 


We haven’t yet learned enough of the techniques of calculus to 
know how to find the derivative of a function with the form of equa- 
tion (3), but by multiplying out the product we can make it into a 
polynomial, which is a form that we do know how to differentiate: 


$$r(f) = -bf^2 + af$$
$$r'(f) = -2bf + a$$


Setting r’ equal to zero, we have 


$$0 = -2bf + a$$
$$f = \frac{a}{2b}$$


With the particular numerical values used to construct the graph, 
this gives an optimal fare of 50 francs, which looks about right from 
the graph. 
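
The result can be cross-checked by brute force. The sketch below uses the same values a = 100 and b = 1 as the graph; the function names and the choice to scan only whole-franc fares are assumptions made for this illustration.

```python
def revenue(f, a=100.0, b=1.0):
    """Daily revenue (a - b f) f for fare f, as in equation (3)."""
    return (a - b * f) * f

def optimal_fare(a=100.0, b=1.0):
    """Fare at which the derivative -2 b f + a equals zero."""
    return a / (2 * b)

best = optimal_fare()

# Brute-force cross-check over whole-franc fares from 0 to 100.
brute = max(range(0, 101), key=revenue)

print(best, revenue(best), brute)
```

Both approaches pick out the same fare. The calculus route has the advantage of giving a formula in terms of a and b, so it does not need to be redone when the numbers change.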


By searching for points where the derivative is zero we can often,
but not always, find the points where a function takes on
its maximum and minimum values. The term extremum (plural extrema)
is used to refer to these points. Figure n shows that quite a
few different things can happen, and that searching for a zero deriva- 
tive doesn’t always tell us the whole story. We have a zero derivative 
at point G, but G is only a maximum compared to nearby points; 
we call G a local maximum, as opposed to the global maximum D. 
The zero-derivative test doesn’t distinguish a local minimum like B 
from a local maximum. A zero derivative may not indicate a local 
extremum at all, as at C and H. We can have points such as E and 
F where the derivative is undefined. An extremum can occur at a 
point like A that is the endpoint of the function’s domain.¹ We
will come back to these technical points in more detail later in the
book.²


1.6 Review: elementary properties of the real 
numbers 


I began this chapter by defining calculus as the study of rates of 
change, but it could equally well be described as the study of in- 
finity. The intuition behind the derivative is that we zoom in on 
a selected point on a smooth curve, until the curve appears like a 
line and we can measure the slope of the line. But the curve won’t 
appear perfectly straight until we’ve cranked up our microscope to 
an infinitely big magnification, at which point we’ll be seeing values 





¹For a more thorough review of notions such as the domain of a function, see
section 5.5, p. 131.
²Section 3.4.1, p. 86.





n / A zero derivative often,
but not always, indicates a local 
extremum. Sometimes we have 
a zero derivative without a local 
extremum, and sometimes a local 
extremum with an undefined or 
nonzero derivative. 





o/The railroad tracks stretch 
toward a vanishing point at 
infinity. Are there infinitely big or 
infinitely small numbers?

[Figure p: the decimal expansion of √2 in modern notation, 1.41421..., and in Stevin’s notation.]

p / Simon Stevin (1548-1620)
was a Flemish mathematician
and engineer who lived a cen- 
tury before the invention of the 
calculus. He wrote a book on 
decimals, using a notation some- 
what different from the modern 
one. (The figure shows the mod- 
ern notation and Stevin’s notation 
for the decimal expansion of √2.)
Stevin’s decimals represent an 
alternative approach to defining 
what we mean by a real number: 
rather than defining them by 
listing their properties, we can 
define them by constructing them 
out of simpler objects (decimal 
digits). Stevin argued for allowing 
any arbitrary, infinite string of 
digits, which is equivalent to 
including all the real numbers 
but forbidding infinitely big and 
infinitely small numbers.

of Δx and Δy that are infinitely small (but not zero). Calculus
was invented by Isaac Newton and Gottfried Wilhelm von Leibniz
back in the era of powdered wigs and silk stockings, and in those
days the concept of “number” was still in the process of being stan-
dardized and formalized.³ Newton and Leibniz found it convenient
to work with symbols representing infinitely big and infinitely small 
numbers, and a debate ensued about whether it was all right to call 
those things “numbers.” 


Today we think about this kind of thing in a different way. De- 
cisions about what to allow as a legal number are thought of not 
as matters of right and wrong but as definitions. We define certain 
sets of numbers, including: 


the integers: whole numbers such as -1, 0, and 1

the rational numbers: ratios of integers such as 2/1 and 3/4

the real numbers, including quantities like π and √2

the complex numbers, such as √(-1)


Do these systems contain infinitely big and infinitely small num- 
bers? Can they? Should they? 


To answer these questions, we need to give a more definite ac- 
count of how these number systems are defined. One good way to 
define them is with a list of their axioms. (For an alternative, con- 
structive approach, see figure p.) Here is a list of axioms for the 
system of real numbers. Except as otherwise stated, each of these 
properties holds for any real-number values of the symbols x, y, ... 


commutativity x + y = y + x and xy = yx


identities There exist numbers 0 and 1 such that for any x, x + 0 =
x and 1x = x.


inverses For any x, there exists a number -x such that x + (-x) =
0. For any nonzero x, there exists 1/x such that (x)(1/x) = 1.


associativity x + (y + z) = (x + y) + z and x(yz) = (xy)z

distributivity x(y + z) = xy + xz


ordering We can define whether or not x < y, and this ordering 
relates to the addition and multiplication operations in specific 
ways, which you’ve seen defined in a previous course on algebra 
and which for brevity we will not explicitly give here. 





³For more on the history, see Blaszczyk, Katz, and Sherry, “Ten misconcep-
tions from the history of analysis and their debunking,” arxiv.org/abs/1202.4153.

This list of axioms holds for the real numbers, but it fails for
the integers, since for example the integer 2 doesn’t have an inverse 
that is an integer. It also fails for the complex numbers, which don’t 
have a well-defined ordering. The list seems detailed and precise, so 
it may come as a surprise that it does not suffice to prove anything 
about whether or not infinite numbers exist. The list of axioms 
is in fact not enough to characterize the real numbers. Later in 
this book we will add another axiom, called the completeness axiom 
(section 4.5, p. 111), to the list. The completeness axiom holds for 
the reals but not the rationals, and it also rules out the existence 
of infinitely large or infinitely small real numbers. It is possible to 
extend the real number system to a larger one that does include 
infinities (section 2.9, p. 64). 
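
To make the remark about inverses concrete, here is a small sketch using Python's exact rational arithmetic (the standard-library fractions module). It spot-checks a few of the axioms on sample values, which is an illustration rather than a proof, and shows that the inverse of the integer 2 is a rational number that is not an integer. The helper name has_integer_inverse is made up for this example.

```python
from fractions import Fraction

def has_integer_inverse(n):
    """True if the integer n has a multiplicative inverse that is also an integer."""
    if n == 0:
        return False  # zero has no multiplicative inverse at all
    return Fraction(1, n).denominator == 1

x, y, z = Fraction(2), Fraction(3, 4), Fraction(-5, 7)

# Spot checks of a few axioms on sample rational values.
assert x + y == y + x and x * y == y * x   # commutativity
assert x * (y + z) == x * y + x * z        # distributivity
assert x * (1 / x) == 1                    # multiplicative inverse

print(1 / x)                    # the inverse of 2, a rational number
print(has_integer_inverse(2))   # is that inverse an integer?
print(has_integer_inverse(-1))  # -1 is its own inverse
```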


1.7 The Leibniz notation 
1.7.1 Motivation 


Lacking the more precise modern ideas described in section 1.6, 
Leibniz argued as follows. Let’s just make Δx and Δy infinitely
small (but not zero). In modern terminology, this means that they
can’t be real numbers. To make it clear that we’re talking about
infinitely small differences in x and y, we change the notation to dx
and dy. Recall that Δ is the Greek version of capital “D,” so we’re
using a smaller version of the letter, “d,” to represent a change that 
is smaller (in fact, infinitely small). Dividing these two “numbers” 
(whatever mysterious species of number they may turn out to be), 
we get the derivative, 


$$\frac{dy}{dx}.$$


Although the notation’s original justification was not up to modern 
standards of rigor, it is one of the most expressive and well-designed 
mathematical notations ever devised, and has been the most com- 
monly used notation for the derivative ever since Leibniz published 
it in 1686. Around 1970, mathematicians clarified some of these 
issues and essentially justified and codified the centuries-old proce-
dures for manipulating the dy’s and dx’s; section 2.9 on p. 64 boils
these modern developments down to a simple set of practical rules. 


1.7.2 With respect to what? 


One of the good things about the Leibniz notation is that it 
states clearly what we’re differentiating with respect to. For example, 
dv/dt could indicate how much a car was speeding up with each 
passing second of time, while dv/dx would measure the speed gained
with each meter that it moved down the road. 


q / Gottfried Wilhelm von Leibniz
(1646-1716).


Box 1.4 The SI


The metric system is the 
system of units used universally 
in engineering and the sciences, 
as well as in daily life in ev- 
ery country except the United 
States. Formally known as the 
Système International (SI), it
was invented during the French 
Revolution. For mechanical (as 
opposed to electrical) measure- 
ments, the SI uses three basic 
units: 


meters for length 
kilograms for mass 
seconds for time 


Other measurements are built 
from these, e.g., meters per sec- 
ond (m/s) for velocity. 


There is a system of prefixes 
that represent powers of ten in 
which the exponent is a mul-
tiple of three. The most com-
mon of these are kilo- = 10³
and milli- = 10⁻³. (The pre-
fix centi- = 10⁻² is used only

in the centimeter, and doesn’t 
require memorization since we 
know that dollars and euros are 
subdivided into 100 cents.)


1.7.3 Shows units

Another selling point of the notation is that it shows the units
of the derivative. For example, the definition of velocity, expressed
in Leibniz notation, is

$$v = \frac{dx}{dt}.$$

On the left-hand side we have velocity, whose units in the SI are 
meters per second. On the right we have a tiny change in position, 
which has units of meters, divided by a tiny change in time, which 
has units of seconds. In terms of units, then, the equation reads as 


$$\text{m/s} = \frac{\text{m}}{\text{s}},$$

which works out correctly. In more complicated examples, checking
the units like this is a powerful method for checking your answer to
a calculus problem.


Burning gasoline Example 7 
> Let x be a car’s odometer reading and g the amount of gasoline 
burned since the odometer was zeroed. One can think of x as 
a function of g. Many cars have a digital display that shows the 
function x’(g) in real time. Express this using the Leibniz notation. 
What is the interpretation of this derivative, and what units does 
it have? 


> The Leibniz notation is dx/dg, which makes it clear that the 
units are kilometers per liter, km/L (or, in U.S. units, miles per 
gallon). The interpretation is that this number gives a measure 
of how efficient the car is at using fuel to transport you a given 
distance. 


An insect pest Example 8 
> An insect pest from the United States is inadvertently released 
in a village in rural China. The pests spread outward at a rate 
of s kilometers per year, forming a widening circle of contagion. 
Find the number of square kilometers per year that become newly 
infested. Check that the units of the result make sense. Interpret 
the result. 


> Let t be the time, in years, since the pest was introduced. The 
radius of the circle is r = st, and its area is a = πr² = π(st)².
To make this look like a polynomial, we have to rewrite it as a =
(πs²)t². The derivative is

$$\frac{da}{dt} = 2\pi s^2 t.$$

The units of s are km/year, so squaring it gives km²/year². The 2
and the π are unitless, and multiplying by t gives units of km²/year,
which is what we expect for da/dt, since it represents the number
of square kilometers per year that become infested.


Interpreting the result, we notice a couple of things. First, the rate 
of infestation isn’t constant; it’s proportional to t, so people might 
not pay so much attention at first, but later on the effort required 
to combat the problem will grow more and more quickly. Second, 
we notice that the result is proportional to s². This suggests that
anything that could be done to reduce s would be very helpful.
For instance, a measure that cut s in half would reduce da/dt by
a factor of four. 
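
The two observations above can be checked numerically. The sketch below compares the formula da/dt = 2πs²t with a finite-difference estimate of the rate of change of the area, and confirms that halving s cuts the rate by a factor of four. The values s = 3 km/year and t = 5 years are arbitrary, chosen only for illustration.

```python
import math

def area(t, s):
    """Infested area in km^2 after t years, for spread rate s in km/year."""
    return math.pi * (s * t) ** 2

def rate(t, s):
    """da/dt = 2*pi*s^2*t, in km^2/year."""
    return 2.0 * math.pi * s ** 2 * t

s, t, dt = 3.0, 5.0, 1e-6

# Finite-difference estimate of da/dt, for comparison with the formula.
estimate = (area(t + dt, s) - area(t, s)) / dt

print(rate(t, s), estimate)
print(rate(t, s) / rate(t, s / 2.0))  # effect of halving s
```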


A whirling bucket Example 9 
> Figure r shows a bucket full of water that is being whirled rapidly, 
so that the water spreads out from the center. The surface of the 
water forms a parabola with the equation 

$$y = \frac{x^2}{c},$$


where c is a constant. Infer the units of c, find the slope of the 
water’s surface, and check the units of your answer. 


> Both x and y are measured in units of meters, so we have 
$$\text{m} = \frac{\text{m}^2}{\text{units of } c}.$$


If the units of the left and right sides are to be equal, c must have 
units of meters as well. 


r / Example 9.

Differentiation gives the slope of the water's surface as

$$\frac{dy}{dx} = \frac{2x}{c},$$


where the factor of 1/c “comes along for the ride,’ as with any 
multiplicative constant. 


Checking the units of the result, we have 


$$\frac{\text{m}}{\text{m}} = \frac{(\text{unitless}) \cdot \text{m}}{\text{m}},$$

which checks out.


1.7.4 Operator interpretation 


Sometimes the Leibniz notation gives an unwieldy, top-heavy 
tower of symbols: 


dx ze 


One way to avoid this awkwardness is to revert to the “prime” no-
tation:

But a more common solution is to write the function being differenti-
ated over on the right:

d (x? 1 + 

de (S + "| = 


This can be seen simply as a typographical expedient, or it can be 
given a mathematical interpretation: we can think of d/dx as meaning
“take the derivative of,” in the same way that √ means “take the
square root of.” We call d/dx the operator describing the operation of
taking a function and giving back the function that is its derivative. 
Math teachers who dislike the historical connotations of the Leibniz 
notation in terms of infinitely small numbers will sometimes present 
the operator interpretation as the only correct interpretation, but 
such a prescription robs the student of some of the utility of the 
notation, e.g., by making it impossible to do the kind of reasoning 
shown in example 8. 


1.8 Approximations 


We saw in section 1.5.2 on p. 23 that the derivative can’t be cal-
culated as Δy/Δx unless the derivative is constant, i.e., unless the
function’s graph is a line. In the Leibniz notation, this is

$$\frac{dy}{dx} \ne \frac{\Delta y}{\Delta x}.$$


But if we take two points very close together on a graph, then 
the curvature doesn’t matter too much, and the line through those 
points is a good approximation to the tangent line, as in figure s. 
We then have the approximation

$$\frac{dy}{dx} \approx \frac{\Delta y}{\Delta x}.$$

It may be of interest to use either side of this as an approximation 
to the other. 


1.8.1 Approximating the derivative 


Suppose you can’t remember that the derivative of x² is 2x, but
you need to find the value of the derivative at x = 1. As in figure s, 
let point P be 

(1.0000, 1.0000), 
and let Q be the nearby point 

(1.0100, 1.0201). 
We then have: 





$$\frac{dy}{dx} \approx \frac{\Delta y}{\Delta x} = \frac{1.0201 - 1.0000}{1.0100 - 1.0000} = \frac{0.0201}{0.0100} = 2.01$$

s / The dotted line through P
and Q is a good approximation to
the tangent line through P.

This is quite a good approximation to the exact answer, 2. If we
needed a better approximation, we could take Q even closer to P. In 
reality we would use this technique in cases where we didn’t know 
the exact answer, and we would then want to know how accurate 
our result was. To do this, we could redo the calculation with a 
smaller value of Δx, say 0.001, and look for the most significant
decimal place that changed. 
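
The procedure just described is easy to carry out by machine. The following sketch, in which the function names are illustrative, repeats the calculation for Δx = 0.01 and then for Δx = 0.001, so that the two results can be compared digit by digit.

```python
def f(x):
    """The function being differentiated, f(x) = x^2."""
    return x ** 2

def slope_estimate(func, x, dx):
    """Approximate the derivative of func at x by the slope through two nearby points."""
    return (func(x + dx) - func(x)) / dx

coarse = slope_estimate(f, 1.0, 0.01)  # P = (1, 1), Q = (1.01, 1.0201)
fine = slope_estimate(f, 1.0, 0.001)   # Q moved ten times closer to P

print(round(coarse, 6))  # close to 2.01
print(round(fine, 6))    # close to 2.001
```

The two estimates first differ in the second decimal place, which suggests that the coarser one is trustworthy to roughly one part in a hundred.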


1.8.2 Approximating finite changes 


Sometimes we know the derivative and want to use it as an 
approximation to find out about finite changes in the variables. For 
example, the Women’s National Basketball Association says that 
balls used in its games should have a radius of 11.6 cm, with an 
allowable range of error of plus or minus 0.1 cm (one millimeter). 
How accurately can we determine the ball’s volume? 


The equation for the volume of a sphere gives V = (4/3)πr³ =
6538 cm³ (about six and a half liters). We have a function V(r),
and we want to know how much of an effect will be produced on 
the function’s output V if its input r is changed by a certain small 
amount. Since the amount by which r can be changed is small 
compared to r, it’s reasonable to apply the approximation 


$$\frac{\Delta V}{\Delta r} \approx \frac{dV}{dr},$$

which gives

$$\Delta V \approx \frac{dV}{dr}\,\Delta r = 4\pi r^2\,\Delta r.$$

(Note that the factor of 4πr² can be interpreted as the ball’s surface
area.) Plugging in numbers, we find that the volume could be off
by as much as (4πr²)(0.1 cm) = 170 cm³. The volume of the ball
can therefore be expressed as 6500 ± 170 cm³, where the original
figure of 6538 has been rounded off to the nearest hundred in order 
to avoid creating the impression that the 3 and the 8 actually mean 
anything — they clearly don’t, since the possible error is out in the 
hundreds’ place. 





This calculation is an example of a very common situation that 
occurs in the sciences, and even in everyday life, in which we base 
a calculation on a number that has some range of uncertainty in 
it, causing a corresponding range of uncertainty in the final result. 
This is called propagation of errors. The idea is that the derivative 
expresses how sensitive the function’s output is to its input. 
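
As a sketch of how this propagation of errors might be carried out numerically, the code below applies ΔV ≈ (dV/dr)Δr to the basketball and compares the result with the difference obtained by recomputing the volume at r = 11.7 cm. The function names are chosen for this example only.

```python
import math

def volume(r):
    """Volume of a sphere of radius r."""
    return (4.0 / 3.0) * math.pi * r ** 3

def dvolume_dr(r):
    """Derivative dV/dr = 4*pi*r^2, which is also the sphere's surface area."""
    return 4.0 * math.pi * r ** 2

r, dr = 11.6, 0.1  # radius and its uncertainty, in cm

linear_estimate = dvolume_dr(r) * dr            # error from the derivative
direct_difference = volume(r + dr) - volume(r)  # error by recalculating

print(round(volume(r)))          # about 6538 cm^3
print(round(linear_estimate))    # about 169 cm^3
print(round(direct_difference))  # about 171 cm^3
```

The two error estimates differ by only a cubic centimeter or two, which is negligible next to the uncertainty itself.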


The example of the basketball could also have been handled 
without calculus, simply by recalculating the volume using a radius 
that was raised from 11.6 to 11.7 cm, and finding the difference 
between the two volumes. Understanding it in terms of calculus, 





t / How accurately can we
determine the ball’s volume? (The radius is marked as 11.6 ± 0.1 cm.)

[Figure u: stopping distance in car lengths versus speed in miles per hour, with curves labeled actual, school rule, and tangent line.]

u / Stopping distance in car
lengths, as a function of initial 
speed in miles per hour. The 
stopping distances were mea- 
sured using professional drivers 
on a track. I’ve defined a car 
length as 4.8 meters, which is the 
length of a Honda Accord. The 
dotted line shows the traditional 
rule taught in schools in the US, 
one car length per 10 m.p.h. of 
speed. The dashed line is the 
tangent at 60 miles per hour, 
which is the best linear approxi-
mation for speeds near this one.

however, gives us a different way of getting at the same ideas, and
often allows us to understand more deeply what’s going on. For 
example, we noticed in passing that the derivative of the volume was 
simply the surface area of the ball, which provides a nice geometric 
visualization. We can imagine inflating the ball so that its radius 
is increased by a millimeter. The amount of added volume equals 
the surface area of the ball multiplied by one millimeter, just as the 
amount of volume added to the world’s oceans by global warming 
equals the oceans’ surface area multiplied by the added depth. 


As another example of an insight that we would have missed if 
we hadn’t applied calculus, consider how much error is incurred in 
the measurement of the width of a book if the ruler is placed on the 
book at a slightly incorrect angle, so that it doesn’t form an angle of 
exactly 90 degrees with the spine. The measurement has its minimum
(and correct) value if the ruler is placed at exactly 90 degrees. Since 
the function has a minimum at this angle, its derivative is zero. That 
means that we expect essentially no error in the measurement if the 
ruler’s angle is just a tiny bit off. This gives us the insight that it’s 
not worth fiddling excessively over the angle in this measurement. 
Other sources of error will be more important. For example, is the 
book a uniform rectangle? Are we using the worn end of the ruler 
as its zero, rather than letting the ruler hang over both sides of the 
book and subtracting the two measurements? 
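
A numerical sketch makes the point about the ruler concrete. It assumes a simple model that is not spelled out in the text: a book of true width w, measured with a ruler tilted by a small angle away from the perpendicular, gives a reading of w/cos(tilt). The width of 15 cm and the tilt angles are arbitrary illustrative values.

```python
import math

def measured_width(true_width, tilt_degrees):
    """Reading of a ruler tilted away from perpendicular (assumed model: w / cos(tilt))."""
    return true_width / math.cos(math.radians(tilt_degrees))

w = 15.0  # true width in cm, an arbitrary example value

for tilt in (0.0, 1.0, 2.0, 5.0):
    error = measured_width(w, tilt) - w
    print(tilt, round(error, 4))  # error in cm for each tilt angle
```

Because the reading has a minimum at zero tilt, the error grows roughly as the square of the tilt angle, so in this model a one-degree misalignment changes the reading by less than 0.02 percent.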


1.8.3 Linear approximation to a curve 


Many people who, like me, learned to drive in the United States 
were taught that when following another car, we should leave space 
equal to one car length for every 10 miles per hour of speed. This rule 
has the advantage of being easy to compute in your head while you’re 
on the freeway, but figure u shows that it’s a poor approximation. 
This is an example of a situation that occurs over and over again in 
real life, which is that we would like to approximate a complicated 
nonlinear function using a simple linear one. The derivative is the 
slope of the tangent line, and the tangent line is the best possible 
line to approximate a given function near a particular point. 


Here is a general procedure for finding the best linear approxi- 
mation to a nonlinear function: 


1. Pick some point on the graph that is near the center of the 
region for which we’re interested in getting a linear approxi- 
mation. 


2. Differentiate the function to find the slope of the tangent line 
through this point. 


3. Given a point on a line and the line’s slope, we can find the 
equation of the line. One way to do this is to write down the 
definition of the slope as Δy/Δx.


Ice cream Example 10
> Fred drives an ice cream truck in Deadhorse, Alaska, where 
the average temperature in the summer is about 10 degrees Cel- 
sius. During the long Arctic winter nights, Fred has developed a 
mathematical model showing that his daily revenue y in dollars is 
related to the Celsius temperature x by the equation 


$$y = -800 + 100x - x^2.$$


Find a useful linear approximation to this equation. 


> Since the average temperature in summer is about x = 10, let’s 
find the best linear approximation near this point. Differentiation 
gives y’ = 100 - 2x, and plugging in x = 10 gives a slope


$$\frac{\Delta y}{\Delta x} = 80. \qquad \text{[slope of the tangent line]}$$


If we plug in the value x = 10 to the equation for y itself, we find 
that the point 


(10, 100) [a point on the tangent line] 


is the one that we're trying to find the tangent line through. We 
therefore have 





$$\frac{y - 100}{x - 10} = 80 \qquad \text{[point-slope form of the line]}$$


for the equation of the best linear approximation. Fred is inter-
ested in calculating his revenue y, so he solves this for y to find,
as an approximation to the true (nonlinear) function,


$$y = -700 + 80x. \qquad \text{[slope-intercept form]}$$
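
The three-step procedure can be sketched in code as follows. This is only an illustration for Fred's model; the function names are invented here, and the derivative is entered by hand rather than computed symbolically.

```python
def revenue(x):
    """Fred's model: daily revenue in dollars at Celsius temperature x."""
    return -800.0 + 100.0 * x - x ** 2

def revenue_slope(x):
    """Derivative of the model, y' = 100 - 2x."""
    return 100.0 - 2.0 * x

def tangent_line(x0):
    """Return (slope, intercept) of the tangent line to the model at x0."""
    slope = revenue_slope(x0)
    intercept = revenue(x0) - slope * x0  # from (y - y0)/(x - x0) = slope
    return slope, intercept

slope, intercept = tangent_line(10.0)
print(slope, intercept)  # coefficients of the linear approximation

# Compare the approximation with the true model near and far from x = 10.
for x in (9.0, 10.0, 11.0, 20.0):
    print(x, revenue(x), intercept + slope * x)
```

Near x = 10 the line and the curve agree to within a dollar, while at x = 20 the gap is much larger, which is why the point chosen in step 1 should sit near the middle of the range of interest.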
1.9 More about units


In section 1.7.3 on p. 28, we briefly discussed the idea of checking 
your calculus by analyzing the units of measurement. If you had a 
good high school chemistry or physics course, you may have already 
learned how to do this to check your algebra. If not, then you may 
find it helpful to study this section, which lays out the ideas in more 
detail. 


Figure v shows a cute snake, along with its even cuter geomet- 
rical idealization as a rectangular box. The snake has 


length ℓ, in units of meters (m),
width w, in units of meters (m), 
mass M, in units of kilograms (kg). 
(Some people would say “in units of length,” and “in units of mass,” 


but to be more concrete I’m using the SI units listed in box 1.4 on 
p. 28.) 


It makes sense to manipulate these quantities in certain ways: 











4w, the snake’s waistline,
w²ℓ, its volume in cubic meters (m³),
M/(w²ℓ), its density in kg/m³,
or
2w + ℓ < 1.14 m,
which tells us whether this snake is legal as carry-on luggage.

v / A snake approximated as
a box.


But some combinations don’t make sense: 


ℓ + M can’t add meters to kilograms
wℓ = w²ℓ can’t equate area to volume
cos M can’t take the cosine of a mass


Some quantities are unitless. I have two dogs, and the 2 is a 
unitless 2; in general, a count is unitless. When we form a ratio 
between two numbers that have the same units, the result is unitless. 
For example, the rectangular snake in the figure has ℓ/w = 12.6,
which is unitless; one way to tell that it’s unitless is that if we
enlarge or reduce the drawing, the quantities that have units grow
or shrink, but the proportions such as ℓ/w stay the same.


The following rules apply: 


1. In addition, subtraction, and comparisons, all terms must have 
the same units. 


2. When you multiply or divide numbers, multiply or divide their 
units as well.

3. All the functions on your calculator that go beyond grade-
school arithmetic require a unitless input and give a unitless
output. These functions include logs, exponentials, and trig 
functions, and are referred to collectively as transcendental 
functions (sec. 5.1.2, p. 126). 
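
The three rules can be mimicked in a few lines of code. The Quantity class below is a toy written for this illustration (it is not a real units library); it stores the exponent of each unit in a dictionary, so that m² is {'m': 2} and a unitless number is {}. The snake's dimensions are made-up values.

```python
import math

class Quantity:
    """A number with units, stored as a dict of unit name -> exponent."""

    def __init__(self, value, units=None):
        self.value = value
        # Drop zero exponents so that m/m compares equal to unitless.
        self.units = {u: p for u, p in (units or {}).items() if p != 0}

    def __add__(self, other):
        # Rule 1: terms being added must have the same units.
        if self.units != other.units:
            raise ValueError("cannot add quantities with different units")
        return Quantity(self.value + other.value, self.units)

    def _combine(self, other, sign):
        units = dict(self.units)
        for u, p in other.units.items():
            units[u] = units.get(u, 0) + sign * p
        return units

    def __mul__(self, other):
        # Rule 2: multiplying numbers multiplies their units.
        return Quantity(self.value * other.value, self._combine(other, +1))

    def __truediv__(self, other):
        # Rule 2: dividing numbers divides their units.
        return Quantity(self.value / other.value, self._combine(other, -1))

def cos(q):
    # Rule 3: transcendental functions need a unitless input.
    if q.units:
        raise ValueError("cosine requires a unitless input")
    return Quantity(math.cos(q.value))

length = Quantity(1.26, {"m": 1})  # illustrative snake dimensions
width = Quantity(0.10, {"m": 1})
mass = Quantity(2.0, {"kg": 1})

print((width * length).units)  # an area
print((length / width).units)  # a unitless ratio
```

Trying length + mass or cos(mass) with this class raises an error, mirroring the combinations that don't make sense.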


Radians aren't units Example 11 
Using the notation shown in figure b, the radian measure of the 
angle θ is defined as s/r. The arc length s and radius r both
have units of meters, so by rule 2 their ratio is unitless. Therefore 
radians are not really a unit. This is required by rule 3 so that we 
can use them as inputs to trig functions. 


Cosine is unitless Example 12 
The cosine is adjacent/hypotenuse, so it’s unitless, as required 
by rule 3. 


Frequency Example 13 
The period T of a vibration is defined as the time it takes to go through
one cycle. The frequency is defined as f = 1/T, and by rule 2 it
has units of 1/seconds or s⁻¹ (also known as Hz).


Area, or volume? Example 14
> You remember that 4πr² is the formula either for the volume of
a sphere or for its surface area, but you can’t remember which it
is. Which one does it have to be based on units?


> The 4π is unitless. By rule 2, the expression 4πr² thus has units
of m², i.e., square meters, or area.


Square roots Example 15 
A square root is not a transcendental function, so rule 3 doesn’t 
apply to it. For example, our snake has a cross-sectional area 
A = w². We then have w = √A, and it’s OK to feed the square
root function a unitful input: m = √(m²).


No units in the exponent Example 16
> We can compute w², where w has units. Does that mean we
can also calculate 2^w?


> No, because then 2^w = e^(ln(2^w)) = e^(w ln 2), but then the input to
the exponential would have units, violating rule 3. I.e., the base-2
exponential is transcendental, just like the base-e flavor.


Radioactivity Example 17
> As a radioactive substance decays, the fraction of it that remains
after time t is given by f = e^(-t/k), where k is a constant. Infer the
units of k.


> By rule 3, t/k must be unitless, so k is in seconds.


Pressure (in millibars, mb) versus
temperature (in degrees Kelvin, 
K) of the atmosphere of Jupiter, 
problem a5. For comparison, the 
atmospheric pressure and tem- 
perature at the earth’s surface are 
about 1000 mb and 300 K. Al- 
though Jupiter is in the outer so- 
lar system and is in general very 
cold, the temperatures in its tenu- 
ous upper atmosphere are, coun- 
terintuitively, very hot; this feature 
of the graph is what would be re- 
ferred to on earth as an “inversion 
layer.” Seiff et al., J. Geophys. Re-
search 103 (1998) 22,857.


Review problems

a1 A line with slope 3 passes through the point (-7, 1). Find
an equation for the line, solving for y. ✓

a2 A line passes through the points (2,3) and (6,5). (a) Find
the slope. (b) Write an equation for the line, solving for y. ✓

a3 A line has the equation 4x - 3y + 1 = 0. Find its slope. ✓

a4 A line has the equation ax + by + c = 0. If x changes by an
amount Δx, find the amount Δy by which y changes. ✓


a5 The figure shows data on the pressure p and temperature
T of the planet Jupiter, as measured by the Galileo probe in 1995.
Can p be described as a function of T? Can T be described as a
function of p? > Solution, p. 224


a6 Suppose that a line is expressed as an equation in the form 
(...)x + (...)y + (...) = 0, where the (...) stand for constants. Under
what conditions does y fail to be a function of x? 

> Solution, p. 224 


a7 Let x and y be real numbers. Which of these equations make 
y a function of x? 


Yaa 2 — y? — 7 r=y 


> Solution, p. 224 
a8 Let S = {ulu? — 2u < 0}. Figure out what set of points is 


really being described here, and rewrite this as a simpler definition 
of the form S = {...|...}. > Solution, p. 224 




Problems 


c1 Differentiate the following functions with respect to t:
We Catty i Voto Tied. > Solution, p. 224 


c2 The functions f and g are defined by 
f(x) = x² and g(s) = s².


Are f and g the same function, or are they different? 
> Solution, p. 225 


c3 Let m be an amount of money. There are many examples 
from business, personal finance, and government in which it makes 
sense to imagine that m is a function of time, m(t). Make up an 
example in which m(t) = 0 but m’(t) ≠ 0. (Don’t make up an
equation, just explain a situation where this would happen and how 
it would be interpreted.) > Solution, p. 225 


c4 A seller offers something at a unit price P, and the quantity 
of units sold is Q. Ordinarily, we expect that P and Q would be 
related in some way that could be expressed by a graph, but there’s 
no obvious way to decide which variable, P or Q, should be on which 
axis. The cause-and-effect relationship isn’t clearly one way or the 
other: a change in price could cause a change in demand, but a 
change in demand could also prompt the seller to change the price. 
The graph is called the demand curve. 


For some unusual goods, the demand is insensitive to the price. 
For example, the drug Soliris treats a genetic disease so rare that 
only about 8,000 people in the U.S. have it. The price P is about 
$400,000 per patient per year. Since the benefits of treatment for 
these people are so great, and the cost is paid for by government or 
private insurers, changing P would not change Q. (a) How would 
this example look on a graph if we put P on the y axis and Q on the 
x axis? What if we did it the other way around? (b) In each case, 
discuss whether the graph is a function. (c) In each case, what can 
you say about the derivative based on the informal definition
given in section 1.2.1? 


In problems d1-d5, a function is defined by giving an equation for y 
in terms of x. Find the derivative of the function. 


dl y = 304-22? +a2+4+1 Vp Solution, p. 225 
d2 0 y=—Ta? +2? -—7a-7 v 
d8 y= 22° + 324 — 2° + 137 v 
d4 y=1le'! — 474+ 27-8 v 

J 


d5 y = 302 +2¢-1


Problems e2-e5 are each intended to be assigned randomly to one
fourth of the students in a class. 


el Differentiate 3z’ — 42? + 6 with respect to z. Check your 
answer by picking an arbitrary value of z and applying the technique 
described in section 1.8.1, p. 30. > Solution, p. 225 


e2 Differentiate 4g? + 4q — 1 with respect to g. Check your 
answer by the same technique as in problem el. v 


e3 Differentiate —11w* + 5w?+6 with respect to w. Check your 
answer by the same technique as in problem el. Vv 


e4 Differentiate c°’ — 18c? + 987 with respect to c. Check your 
answer by the same technique as in problem el. Vv 


e5 Differentiate 10r!° — 6r®° + 7 with respect to r. Check your 
answer by the same technique as in problem el. v 


e6 Find three different functions whose derivatives are the con- 
stant 7, and give a geometrical interpretation. 
> Solution, p. 225 


fl Let the function y be defined by y(x) = pa? — qx +r, where 
p, q, and r are constants. Find y’(x). Jv 


f2 Let the function h be defined by h(u) = au? — # +c, where 
a, b, and c are constants. Find h’(u). ✓


In problems f3-f5 you will need to start by rewriting the given ex-
pressions in a form that you know how to differentiate. (If you’ve
had some previous exposure to calculus, you may already know the 
product rule or chain rule. Some of these problems can be done us- 
ing those rules, but they can also be done without them. If you use 
them, explain that you’re doing so.) 


f3 Let the function f be defined by f(x) = (x + 1)(2x + 3).
Find f’(x). ✓


f4 Let the function q be defined by q(c) = (2c?)(7c). Find 
/ 
q'(c). Me 


\2 
f5 ~~ Let the function z be defined by z(j) = (aj)*—7 (2) , where 


a and r are constants. Find z'(j). v 


f6 Let the function f be defined by

$$f(x) = \frac{x^{m+1}}{m+1},$$

where m ≠ -1 is a constant. Find f’(x). ✓


g1 Consider the function f defined by f(x) = |x|.

(a) Sketch its graph. If you’re not sure what it would look like, try 
to gain insight by calculating points for a few values of x, including 
values that are positive, negative, and zero. 

(b) On p. 14 I gave an informal definition of the tangent line and 
the derivative in terms of zooming in on a graph. Does this function 
have a well-defined tangent line at x = 0? A well-defined derivative? 
(c) On p. 16 I defined a special type of tangent line called a no-cut 
line, and the definition requires that the no-cut line be unique, i.e., 
there is not more than one line with the given properties. Is there 
a no-cut line at x = 0 for this function? 


g2 Consider the function f defined as follows: 
0 ifx<0 


a ifx>0 


(a) Sketch its graph. If you’re not sure what it would look like, try 
to gain insight by calculating points for a few values of x, including 
values that are positive, negative, and zero. 

(b) On p. 14 I gave an informal definition of the tangent line and 
the derivative in terms of zooming in on a graph. Does this function 
have a well-defined tangent line at x = 0? A well-defined derivative? 
(c) On p. 16 I defined a special type of tangent line called a no-cut 
line, and the definition requires that the no-cut line be unique, i.e., 
there is not more than one line with the given properties. Is there 
a no-cut line at x = 0 for this function? 
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g3 Consider the function f defined by f(a) = la]. 

(a) Sketch its graph. If you’re not sure what it would look like, try 
to gain insight by calculating points for a few values of x, including 
values that are positive, negative, and zero. For insight, try a very 
small value of x such as 107°; think about how f(x) compares with 
x for this small x, and what this tells you about the shape of the 
graph near x = 0. 

(b) On p. 14 I gave an informal definition of the tangent line and 
the derivative in terms of zooming in on a graph. Does this function 
have a well-defined tangent line at x = 0? A well-defined derivative? 
(c) On p. 16 I defined a special type of tangent line called a no-cut 
line, and the definition requires that the no-cut line be unique, i.e., 
there is not more than one line with the given properties. Is there 
a no-cut line at x = 0 for this function? 


i1 Differentiate at² + bt + c with respect to t.
[Thompson, 1919] > Solution, p. 226


i2 Let the function f be defined by f(x) = 3x" + Ex = 3. Find 


the value of x for which f’(x) = 3. v 
i3 The variables u and r are related by u = ar? — zr + 3. Find 


the value of r that minimizes u.
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i4 Recall that the range of a function is the set of possible values 
its output can have. Find the ranges of the following functions. 


f(x) = 2x² + 3
g(x) = −2x² + 4x
h(x) = 4x + x²
k(x) = 1/(1 + x²)
ℓ(x) = 1/(3 + 2x + x²)
m(x) = 4 sin x + sin² x


(For m, if you’ve forgotten your trig you may wish to review from 
section 5.3, p. 128. It is possible to do this problem without knowing 
how to differentiate the sine function.) 


You will find it convenient to express some of your answers using
notations such as [17, ∞), which is a standard way of extending
the normal notation for finite intervals (p. 15) to describe infinite
ones. This example means, as you’d imagine, the set {u | u ≥ 17}.
Although ∞ isn’t a real number, the notation gets the idea across.
The use of the ) rather than a ] is to show that there isn’t a member
of the set whose value is infinite.


Although you may be able to guess some of the answers by con- 
structing a graph, that does not constitute a proof of the exact 
result. 


i5 Consider the following four functions: 
f(x) = x² − 2x + 7
g(u) = u⁸ − 2u⁴ + 7
h(v) = ln(v² − 2v + 7)
k(w) = tan² w − 2 tan w + 7
Determine the minimum value of each function. 


Although you may be able to get approximations to the answers by 
graphing, that does not constitute a proof of the exact result, which 
is what is required here. You may, however, find it helpful to check 
your exact results using graphing, e.g., on the online graphing app 
at desmos.com. 


If you’ve forgotten some of your precalculus mathematics, you may
wish to review trig from section 5.3, p. 128 and logarithms from
section 5.7, p. 134. It is possible to do this problem without knowing
how to differentiate the functions ln and tan; instead, reason about
how the inputs and outputs of the functions work, and think about
how the construction of functions h and k relates them to functions
f and g.
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k1 Children grow up, but adults more often grow in the hor- 
izontal direction. Suppose we model a human body as a cylinder 
of height h and circumference c. The person’s body mass is given 
by m = ρv, where v is the volume and ρ (Greek letter rho, the
equivalent of Latin “r”) is the density. Find dm/dc, the rate at
which body mass grows with waistline, assuming constant height and
density. Check that your answer has the right units, as in example
8 on p. 28 and section 1.9 on p. 34.


k2 Let t be the time that has elapsed since the Big Bang. In
that time, one would imagine that light, traveling at speed c, has
been able to travel a maximum distance ct. (In fact the distance is
several times more than this, because according to Einstein’s theory
of general relativity, space itself has been expanding while the ray of
light was in transit.) The portion of the universe that we can observe
would then be a sphere of radius ct, with volume v = (4/3)πr³ =
(4/3)π(ct)³. Compute the rate dv/dt at which the volume of the
observable universe is increasing, and check that your answer has
the right units, as in example 8 on page 28 and section 1.9 on p. 34.
Hint: We’re differentiating with respect to t, and the thing being
cubed is not just t, so this is not a form that you know how to
differentiate. Use algebra to convert it into a form that you do
know how to handle.


k3 Kinetic energy is a measure of an object’s quantity of mo- 
tion; when you buy gasoline, the energy you’re paying for will be 
converted into the car’s kinetic energy (actually only some of it, 
since the engine isn’t perfectly efficient). The kinetic energy of an 
object with mass m and velocity v is given by K = (1/2)mv².

(a) As described in box 1.4 on p. 28, infer the SI units of kinetic 
energy. 

(b) For a car accelerating at a steady rate, with v = at, find the
rate dK/dt at which the engine is required to put out kinetic
energy. dK/dt, with units of energy over time, is known as the power.
Hint: We’re differentiating with respect to t, and the thing being
squared is not just t, so this is not a form that you know how to
differentiate. Use algebra to convert it into a form that you do know
how to handle.
(c) Check that your answer has the right units, as in example 8 on 
page 28 and section 1.9 on p. 34. 


m1 Section 1.2.3 on p. 16 defines the addition and vertical
stretch properties of the derivative. If we assume that the addition
property is true, prove that the vertical stretch property must hold
for any stretch factor r that is a natural number (1, 2, 3, ...).
> Solution, p. 226


m2 Section 1.2.3 on p. 16 defines the constant and line properties 
of the derivative. Prove that the constant property follows from the 
line property. 
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m3 Section 1.2.3 on p. 16 defines the addition, constant, and 
vertical shift properties of the derivative. If we assume that the 
addition and constant properties are true, prove that the vertical shift
property must hold.


m4 An even function is one with the property f(−x) = f(x).
For example, cos x is an even function, and xⁿ is an even function if
n is even. An odd function has f(−x) = −f(x). Use the horizontal
flip property of the derivative (p. 16) to prove that the derivative of
an even function is odd.


n1 Rancher Rick has a length of cyclone fence L with which
to enclose a rectangular pasture. Show that he can enclose the
greatest possible area by forming a square with sides of length L/4.
> Solution, p. 226


n2 Prove that the total number of maxima and minima possessed
by a third-order polynomial is at most two. > Solution, p. 226


n3 A factory produces widgets, and the cost of production for
a given year is an + bn², where n is the number produced, a is the
basic cost of producing one widget, and b represents the fact that in
order to increase volume, the factory must take expensive steps such
as adding a night shift, paying overtime, or offering higher wages in
order to attract more and better workers. The widgets are sold at
a fixed unit wholesale price k, and there is unlimited demand.

(a) Find the optimal number of widgets that the factory should
produce.
(b) Check that your answer has the right units, as in example 8 on 
page 28 and section 1.9 on p. 34. 

(c) Interpret the case where b = 0. 

(d) Interpret the case where k < a. 


n4 A steel sphere of radius r is dropped into an upright cylinder
of radius b > r. For a fixed value of b, find the value of r that
maximizes the amount of water that needs to be poured into the
cylinder in order to cover the sphere.


Problems p1-p3 are each intended to be assigned randomly to one 
third of the students in a class. 


p1 A circle has area a, diameter d, and radius r. Express a in
terms of r, d in terms of r, and a in terms of d. Find the derivatives
da/dr, dd/dr, and da/dd. The Leibniz notation suggests that we
should have

\[
\frac{da}{dr} = \frac{da}{dd}\,\frac{dd}{dr}.
\]

Is this actually true?
p2 A sphere has volume v, diameter d, and radius r. Express v in
terms of r, d in terms of r, and v in terms of d. Find the derivatives
dv/dr, dd/dr, and dv/dd. The Leibniz notation suggests that we
should have

\[
\frac{dv}{dr} = \frac{dv}{dd}\,\frac{dd}{dr}.
\]

Is this actually true?


p3 An equilateral triangle has sides of length s, perimeter p,
and area a. Express a in terms of p, p in terms of s, and a in terms
of s. Find the derivatives da/dp, dp/ds, and da/ds. The Leibniz
notation suggests that we should have

\[
\frac{da}{ds} = \frac{da}{dp}\,\frac{dp}{ds}.
\]

Is this actually true?

q1 As a tree grows in height h, it gains mass m, so that we have
some function m(h). If h is measured in units of meters, and m in
kilograms, what are the units of the changes Δm and Δh and of the
derivative dm/dh?


q2 A tank is filling with water. The volume (in cubic meters) of
water in the tank at time t (seconds) is V(t). What units does the 
derivative V’(t) have? 


r1 Use the technique in section 1.8.1 to obtain a numerical
approximation to the derivative of the function y = 1/(1 — x) at 
x = 0. Find an answer accurate to three decimal places. 

> Solution, p. 226 


r2 Use the technique in section 1.8.1 to obtain a numerical 
approximation to the derivative of the function y = cos(a#°) at x = 1. 
Find an answer accurate to three decimal places. v 

r3 Use the technique in section 1.8.1 to obtain a numerical
approximation to the derivative of the function y = sin √x at x = 1.
Find an answer accurate to three decimal places.

r4 Use the technique in section 1.8.1 to obtain a numerical 
approximation to the derivative of the function y = e©%* at 7 = 1. 
Find an answer accurate to three decimal places. v 

r5 A function of the form U = 1/(1 + e^r) occurs in nuclear
physics, and its derivative is interpreted as the force acting on a
neutron or proton when it is at a distance r from the center of the
nucleus. Use the technique in section 1.8.1 to obtain a numerical
approximation to the derivative of this function at r = 1. Find an
answer accurate to three decimal places.
s1 Suppose that we measure a quantity x and compute from it
y = kxⁿ, where k is a constant and n is a natural number. Let Δx
be an estimate of the amount of possible measurement error in x,
and let Δy be the corresponding error estimate for the output of
the calculation.
(a) Show that if Δx is small compared to x, then

\[
\frac{\Delta y}{y} \approx n\,\frac{\Delta x}{x}.
\]

(b) Vernier calipers are used to measure the length of the sides 
of a square tile to a precision of 0.1%. Use the result of part a 
to find the possible error in an area computed from this length. 

> Solution, p. 227 


s2 A hobbyist is going to measure the height to which her model 
rocket rises at the peak of its trajectory. She plans to take a digital 
photo from far away and then do trigonometry to determine the 
height, given the baseline from the launchpad to the camera and 
the angular height of the rocket as determined from analysis of the 
photo. Comment on the error incurred by the inability to snap the 
photo at exactly the right moment. > Solution, p. 227 


s3 Joe sells square sheets of gold foil. Since gold is expensive, 
the sheets are sold by area a. If the area is too small, the customer 
gets upset, but if the area is too high, Joe is losing money. Therefore 
he wants to make sure that the area doesn’t differ from a by more 
than Δa. In his shop, Joe marks off squares of length x.

(a) No measurement is perfectly exact. By what amount Δx can
his length measurement be off if the resulting error in the area is to
be no more than Δa? Use the approximation method described in
section 1.8.2 on p. 31.
(b) Check that your answer has the right units, as in example 8 on 
page 28 and section 1.9 on p. 34. 

(c) If the desired area is a = 4.000 m², and the maximum allowable
error in area is 0.001 m², what is the biggest error Joe can afford to
make when he marks off the length x? Express your result using an
appropriate unit or in scientific notation, not as an awkward decimal
with a string of zeroes.


t1 (a) Let y = x^p, where the constant p is a natural number.
Find the best linear approximation to this function for values of x
near 1.
(b) Use the result of part a to approximate the value of 1.0000011%" 
without a calculator. v 
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t2 The role of examples and counterexamples in proofs was 
introduced in box 1.3, p. 20. Sally claims that any function y = xⁿ,
where n is a natural number, has y′ = 0 at x = 0. To prove this,
she gives a correct calculation of the derivative of y = x² at x = 0.
(a) Explain why her proof is incorrect. (b) Disprove her claim by 
giving a counterexample. 


t3 The role of examples and counterexamples in proofs was intro- 
duced in box 1.3, p. 20. The addition rule for the derivative (p. 16) 
tells us that the derivative of a sum is the sum of the derivatives. 
Huy proposes that the same thing holds for multiplication: that the 
derivative of a product is the same as the product of the derivatives. 
Disprove Huy’s proposal by giving a counterexample. 




Chapter 2 


Limits; techniques of 
differentiation 


In chapter 1 we started computing derivatives simply by appealing 
to a list of geometrically plausible properties (section 1.2.3, p. 16). 
These properties are true, and by taking them as axioms we were 
able to prove rigorously that, for example, the derivative of x² is 2x
(section 1.2.4, p. 17). But there are many problems that are messy 
to solve by this limited toolbox of techniques, and many others for 
which we need qualitatively different tools. 


Historically, the way Newton and Leibniz approached the problem
was as follows. Suppose we want to take the derivative of x² at
the point P where x = 1. We already know that we can get a good
numerical approximation to this derivative by taking a second point
Q, close to P, and evaluating the slope of the line through P and Q.
(See section 1.8.1, p. 30.) Now instead of picking specific numbers,
let’s just take point Q to lie at x = 1 + dx, where dx is very small.
Then the slope of the line through P and Q is

\begin{align*}
\text{slope of line PQ} &= \frac{\Delta y}{\Delta x} \\
  &= \frac{(1+dx)^2 - 1}{(1+dx) - 1} \\
  &= \frac{2\,dx + dx^2}{dx}.
\end{align*}


Now comes the crucial leap of faith, which mathematicians of later
centuries began to feel was a little too sketchy. The number dx is
supposed to be small, and when you square a small number you
get an even smaller number. Since dx is supposed to be infinitely
small, dx² should be so small that it’s utterly unimportant, even
compared to dx. Therefore we throw away the dx² term and find
that the slope of the tangent line is 2.
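
To see numerically what is being thrown away, here is a minimal Python sketch. The helper names `secant_slope` and `square` are my own choices for illustration, not notation from the text. It evaluates the slope of line PQ for a few small but finite values of dx; algebraically the slope is 2 + dx, so the leftover piece shrinks along with dx.

```python
# Sketch only: the helper names below are illustrative assumptions.

def secant_slope(f, x, dx):
    """Slope of the line through (x, f(x)) and (x + dx, f(x + dx))."""
    return (f(x + dx) - f(x)) / dx


def square(x):
    return x * x


if __name__ == "__main__":
    # Point P is at x = 1; point Q is at x = 1 + dx.
    for dx in (0.1, 0.01, 0.001, 0.0001):
        slope = secant_slope(square, 1.0, dx)
        print(f"dx = {dx:<7} slope of PQ = {slope:.6f}")
```

A computer can only use finite values of dx, so this illustrates the trend rather than justifying the step of discarding dx².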


2.1 The definition of the limit 


Starting in the 19th century, mathematicians became less and less 
satisfied with the logical justification for this style of doing calcu- 
lus. The real number system had gradually become defined in a 
standardized way. It became clear that although one could have a 
number system that obeyed the axioms given in section 1.6 (p. 25) 


>Box 2.1 Ideas about 
proof: proof by contradic- 
tion 


The practice of throwing 
away the square of dx shows 
that many mathematicians, for 
over a century, were willing 
to believe in nonzero numbers 
whose squares were zero. That 
contradicts what you learned in 
grade school, but it’s not nec- 
essarily wrong. A proof has 
to be based on certain assump- 
tions (box 1.2, p. 16). Those 
mathematicians simply didn’t 
assume the same list of prop- 
erties that is now standard for 
the real number system (sec- 
tion 1.6, p. 25). 


Let’s use those assumptions to prove that we can’t have a
nonzero x such that x² = 0. Suppose that such an x did exist.
Then since x ≠ 0, by the multiplicative inverse property
there is a number 1/x. Taking both sides of x² = 0 and
multiplying by 1/x gives x²/x = 0/x, or x = 0. But this
contradicts the original claim that x was nonzero.


This is a proof by contra- 
diction. If we assume some- 
thing is true, and can then, 
through valid reasoning, arrive 
at mutually contradictory re- 
sults, then the initial assump- 
tion must have been false. 
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x f(x) 
3.000000 | 0.600000 
2.500000 | 0.555556 
2.100000 | 0.512195 
2.010000 | 0.501247 
2.001000 | 0.500125 

Example 2. 
and that included infinitely small numbers,¹ such a system would
not be the same as the real numbers. Furthermore one would have
a problem with the procedure of treating a dx² as if it were zero;
one can prove from those axioms that zero itself is the only number
whose square is zero (box 2.1, p. 47). For these reasons, mathematicians
turned to a different way of defining the derivative, by using
the new notion of a limit.


2.1.1. An informal definition 


While it is easy to define precisely in a few words what a square 
root is (√a is the positive number whose square is a) the definition
of the limit of a function runs over several terse lines, and most 
people don’t find it very enlightening when they first see it. So we 
postpone this momentarily and start by building up our intuition. 


Definition of limit (first attempt) 
If f is some function then

\[
\lim_{x \to a} f(x) = L
\]

is read “the limit of f(x) as x approaches a is L.” It means
that if you choose values of x which are close but not equal to
a, then f(x) will be close to the value L; moreover, f(x) gets
closer and closer to L as x gets closer and closer to a.


The following alternative notation is sometimes used:

f(x) → L as x → a

(read “f(x) approaches L as x approaches a” or “f(x) goes to L as
x goes to a”).


Example 1 
If f(x) = x + 3 then

\[
\lim_{x \to 4} f(x) = 7
\]

is true, because if you substitute numbers x close to 4 in f(x) =
x + 3 the result will be close to 7.





Substituting numbers to guess a limit Example 2 
What (if anything) is

\[
\lim_{x \to 2} \frac{x^2 - 2x}{x^2 - 4}\,?
\]

Here f(x) = (x² − 2x)/(x² − 4) and a = 2.
We first try to substitute x = 2, but this leads to

\[
f(2) = \frac{2^2 - 2\cdot 2}{2^2 - 4} = \frac{0}{0},
\]

which does not exist. Next we try to substitute values of x close
but not equal to 2. The table suggests that f(x) approaches 0.5.
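
The table for this example can be reproduced with a few lines of Python. This is only a sketch; the extra inputs below 2 are my own addition, included to show that the same value is approached from the other side.

```python
def f(x):
    """The function from example 2; undefined at x = 2 (and at x = -2)."""
    return (x**2 - 2 * x) / (x**2 - 4)


if __name__ == "__main__":
    print("x        | f(x)")
    # The first five inputs are the ones in the book's table;
    # the last two (below 2) are an added assumption for illustration.
    for x in (3.0, 2.5, 2.1, 2.01, 2.001, 1.999, 1.99):
        print(f"{x:.6f} | {f(x):.6f}")
```

As the next example warns, a table like this can only suggest a value for the limit; it does not prove it.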





¹For more on this topic, see section 2.9 on p. 64.
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Substituting numbers can suggest the wrong answer. Example 3 
Our first definition of “limit” was not very precise, because it said
“x close to a,” but how close is close enough? Suppose we had
taken the function

\[
g(x) = \frac{101\,000x}{100\,000x + 1}
\]

and we had asked for the limit lim_{x→0} g(x). Then substitution of
some “small values of x,” as shown in the table, could lead us
to believe that the limit was 1. Only when you substitute even
smaller values do you find that the limit is zero!
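
A short Python sketch makes the trap visible. The choice of inputs (powers of ten from 1 down to 10⁻⁹) is my own, for illustration; the book's table stops at 0.001.

```python
def g(x):
    """The function from example 3."""
    return 101_000 * x / (100_000 * x + 1)


if __name__ == "__main__":
    # Keep shrinking x well past the values shown in the book's table.
    for k in range(10):
        x = 10.0 ** (-k)
        print(f"x = {x:.0e}   g(x) = {g(x):.6f}")
```

The outputs stay near 1 until x becomes comparable to 1/100 000, at which point the "+ 1" in the denominator starts to matter and g(x) heads toward zero.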


2.1.2 The formal, authoritative definition of the limit 


The informal description of the limit uses phrases like “closer 
and closer” and “really very small.” In the end we don’t really 
know what they mean, although they are suggestive. Fortunately 
there is a better definition, i.e. one which is unambiguous and can 
be used to settle any dispute about the question of whether or not 
lim_{x→a} f(x) equals some number L.


Definition of the limit 
We say that L is the limit of f(x) as x → a, if the following two
conditions hold:


1. The function f(x) need not be defined at x = a, but it must
be defined for all other x in some interval which contains a.


2. For every ε > 0 there exists a δ > 0 such that for all values of
x in the domain of f with |x − a| < δ, we have |f(x) − L| < ε.


(The Greek letter “δ” is lowercase delta, equivalent to the Latin “d,”
and “ε” is epsilon, which is like Latin “e.”)


Why the absolute values? The quantity |x − y| is the distance
between the points x and y on the number line, and one can measure
how close x is to y by calculating |x − y|. The inequality |x − y| < δ
says that “the distance between x and y is less than δ,” or that “x
and y are closer than δ.”


What are ε and δ? The quantity ε is how close you would like
f(x) to be to its limit L; the quantity δ is how close you have to
choose x to a to achieve this. To prove that lim_{x→a} f(x) = L you
must assume that someone has given you an unknown ε > 0, and
then find a positive δ for which x values that close to a result in
values of f that lie within the range the person has demanded. The δ
you find will depend on ε.
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x g(x) 
1.000000 | 1.009990 
0.500000 | 1.009980 
0.100000 | 1.009899 
0.010000 | 1.008991 
0.001000 | 1.000000 

Example 3. 
a / The value of ε is imposed
on us. We have succeeded in
finding a value of δ small enough
so that the outputs of the function
do lie within the desired range. If
we can do this for every value of
ε, then the limit is L.
Example 4
> Show that lim_{x→5} (2x + 1) = 11.


> We have f(x) = 2x + 1, a = 5 and L = 11, and the question we
must answer is “how close should x be to 5 if we want to be sure that
f(x) = 2x + 1 differs by less than ε from L = 11?”


To figure this out we try to get an idea of how big |f(x) − L| is:

\[
|f(x) - L| = |(2x + 1) - 11| = |2x - 10| = 2\,|x - 5| = 2\,|x - a|.
\]

So, if 2|x − a| < ε then we have |f(x) − L| < ε, i.e.,

if |x − a| < ε/2 then |f(x) − L| < ε.


We can therefore choose δ = ε/2. No matter what ε > 0 we are
given, our δ will also be positive, and if |x − 5| < δ then we can
guarantee |(2x + 1) − 11| < ε. That shows that lim_{x→5} (2x + 1) = 11.
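
The ε-δ bookkeeping in this example can be spot-checked by machine. The sketch below is my own illustration, not part of the proof: the function name `delta_works`, the number of trials, and the random sampling scheme are all assumptions. It uses exact rational arithmetic so that rounding cannot blur the strict inequalities, and it contrasts the choice δ = ε/2 with the too-generous choice δ = ε.

```python
from fractions import Fraction
import random


def f(x):
    return 2 * x + 1


def delta_works(eps, delta, trials=1000, seed=0):
    """Sample x with |x - 5| < delta and test whether |f(x) - 11| < eps.

    Returns False as soon as one sampled x violates the inequality.
    Passing this check is evidence, not a proof.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        # Exact rational offset t with |t| < delta.
        t = Fraction(rng.randint(-999_999, 999_999), 1_000_000) * delta
        x = 5 + t
        if abs(f(x) - 11) >= eps:
            return False
    return True


if __name__ == "__main__":
    eps = Fraction(1, 10)
    print(delta_works(eps, eps / 2))  # the choice made in example 4
    print(delta_works(eps, eps))      # a delta that is too large
```

A sampling check like this can expose a bad choice of δ, but only the algebra above shows that δ = ε/2 works for every x and every ε.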


Discussion question 


A Figure a on p. 49 shows an example where δ is small enough for
the given value of ε. What would the figure look like in a case where the
value of δ was not small enough?


B Proof by contradiction was introduced in box 2.1 on p. 47. It can be
considered as a specific mathematical version of an ancient technique of
argument called reductio ad absurdum, or reduction to absurdity, which
means to disprove something by showing that if it were true, then one 
could arrive at ridiculous results. When we say, “if that’s true, then the 
Pope’s not Catholic,” we’re implying that we could give a reductio ad ab- 
surdum. Suppose that Johnny insists on the obvious axiomatic truths (1) 
that monsters live under beds and inside closets; and (2) that monsters 
come out of their hiding places when the lights are turned out. Johnny 
doesn’t want to get eaten by a monster, and has therefore been sleeping 
with the lights on ever since he can remember. Taking Johnny’s axioms 
as valid assumptions, convince him using a reductio ad absurdum that 
monsters do not eat little boys. 





2.2. The definition of the derivative 




The single most important application of the limit is that it gives 
us a way to formalize the idea of a derivative, which we have so 
far been using on an informal basis. We start from the Newton- 
Leibniz approach described on p. 47, but modify it by using a limit 
to get rid of the questionable procedure of discarding the square of 
an infinitesimal number. 


Definition of the derivative 
The derivative of a function f at a point x is

\[
f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}.
\]







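The derivative of x², using limits Example 5
Let’s use the definition to find the derivative of x² at x = 1. We
have

\begin{align*}
f'(1) &= \lim_{\Delta x \to 0} \frac{(1 + \Delta x)^2 - 1}{\Delta x} \\
      &= \lim_{\Delta x \to 0} \frac{2\Delta x + \Delta x^2}{\Delta x} \\
      &= \lim_{\Delta x \to 0} (2 + \Delta x).
\end{align*}

We’ve already shown in example 4 on p. 50 that this sort of limit
of a linear function is just what you would expect by plugging in to
the equation of the line, and therefore we have f′(1) = 2.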


The derivative of an exponential function, with limits Example 6
In example 3 on p. 19, we inferred using a simple geometrical
trick that the derivative of an exponential function like f(x) = 2^x
must be proportional to f itself,

\[
f' = kf,
\]

where the constant of proportionality k depends on the base,
such as 2. We can now prove the same fact using limits, and
say something about the value of the constant. Since this fact is
supposed to hold for all values of x, and k is to be the same for
any x, we can pick any convenient value for x, say x = 0. For the
derivative we have

\begin{align*}
f'(0) &= \lim_{\Delta x \to 0} \frac{2^{0 + \Delta x} - 2^0}{\Delta x} \\
      &= \lim_{\Delta x \to 0} \frac{2^{\Delta x} - 1}{\Delta x}.
\end{align*}

Since f(0) = 1, we have

\[
k = \lim_{\Delta x \to 0} \frac{2^{\Delta x} - 1}{\Delta x}.
\]

We can get as good an approximation to this limit as we like by
plugging in small enough values of Δx. For example, Δx = 10⁻⁴
gives k ≈ 0.69317, which seems to be an approximation to ln 2 =
0.69314... This naturally leads us to conjecture that the derivative
of b^x equals (ln b)b^x, and in particular that the derivative of e^x
is simply e^x. This is investigated further in section 5.2, p. 126.
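
The numerical estimate quoted in this example is easy to reproduce. In the sketch below, `k_estimate` is a name I have made up for the difference quotient (b^Δx − 1)/Δx; comparing it with the natural logarithm is a check on the conjecture, not a proof of it.

```python
import math


def k_estimate(base, dx):
    """Difference quotient (base**dx - 1) / dx, approximating k for f(x) = base**x."""
    return (base**dx - 1) / dx


if __name__ == "__main__":
    print(f"k for base 2, dx = 1e-4: {k_estimate(2, 1e-4):.5f}")
    print(f"ln 2:                    {math.log(2):.5f}")
    # The conjecture predicts k = 1 when the base is e.
    print(f"k for base e, dx = 1e-6: {k_estimate(math.e, 1e-6):.5f}")
```

Making Δx much smaller than about 10⁻⁸ tends to make the estimate worse rather than better, because the subtraction in the numerator loses precision in floating-point arithmetic.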


If the limit referred to in the definition of the derivative is unde- 
fined at a certain x, then the derivative is undefined there, and we 
say that f is not differentiable at x. Differentiability is discussed in 
more detail in section 2.8, p. 61. 


We seldom evaluate a derivative by directly applying its defini- 
tion as a limit. Instead, we use a variety of other more convenient 
rules that follow from the definition. Some of these are the prop- 
erties in section 1.2.3, p. 16. In addition, we will learn two very 
important and useful rules, the product rule and the chain rule. 





b / A geometrical interpretation
of the expression 2Δx + Δx²
occurring in the second line of
example 5. The area gained by
increasing the size of the square
equals the area of the two thin
strips plus the area of the small
square.

c / A geometrical interpretation
of the product rule.


2.3 The product rule 


The idea behind the product rule is very similar to the geometrical 
intuition expressed by figure b on p. 51 for the derivative of x².
Suppose that instead of x multiplied by x to make x², we have some
other function such as (x? + 7)(x°), which is also the product of two 
factors. Call these factors u(x) and v(x), so that the function we’re 
differentiating is f(x) = u(x)v(x). Then the expression we get by 
applying the definition of the derivative to f can be written in terms 
of the rectangular areas in figure c as 





\[
f'(x) = \lim_{\Delta x \to 0}
  \frac{(\text{right strip}) + (\text{top strip}) + (\text{tiny box})}{\Delta x}.
\]

One can prove from the definition of the limit that the limit of a
sum is equal to the sum of the limits, provided that the individual
limits exist (see section 4.1, p. 95, property P3), so:

\[
f'(x) = \lim_{\Delta x \to 0} \frac{(\text{right strip})}{\Delta x}
      + \lim_{\Delta x \to 0} \frac{(\text{top strip})}{\Delta x}
      + \lim_{\Delta x \to 0} \frac{(\text{tiny box})}{\Delta x}.
\]

If the functions u and v are both well-behaved at x (specifically,
if both of them are differentiable), then the “tiny box” term will
vanish upon application of the limit just as in example 5. We then
have

\begin{align*}
f'(x) &= \lim_{\Delta x \to 0} \frac{(\text{right strip})}{\Delta x}
       + \lim_{\Delta x \to 0} \frac{(\text{top strip})}{\Delta x} \\
      &= u'(x)v(x) + v'(x)u(x).
\end{align*}





We have the following extremely important and useful rule for dif- 
ferentiation: 


Product rule 
Let f = uv, where f, u, and v are all functions. Then at any point 
where u and v are both differentiable, 


f′ = u′v + v′u.


The product rule for x³ Example 7
So far we have never actually proved any derivatives of powers of
x other than x²; although the proofs can be done by the methods
of ch. 1, they are tedious. These results come out much more
easily by applying the product rule. We have already proved that
the derivative of x² was 2x. To get the derivative of x³, we can
simply rewrite it as the product (x²)·(x). Applying the product rule
then gives

\begin{align*}
(x^3)' &= (x^2)'\,(x) + (x^2)\,(x)' \\
       &= (2x)(x) + (x^2)(1) \\
       &= 3x^2.
\end{align*}


A dirty trick for finding the derivative of 1/x Example 8
How do we differentiate 1/x? We can guess the right result by
recalling that this expression can also be written as x⁻¹. (Exponents,
including negative ones, will be reviewed more systematically
in section 2.5, p. 56.) If we then assume that the power rule
(xⁿ)′ = nxⁿ⁻¹ applies to n = −1, then the result should be that the
derivative of 1/x is −x⁻², or −1/x².


But that’s only a reasonable guess, not a proof. We can prove it by
the following dirty trick. Write 1 = (x)(1/x), and then differentiate
on both sides. The left-hand side is a constant, so its derivative
is zero. Applying the product rule to the right-hand side, we get
(x)′(1/x) + (x)(1/x)′. Since (x)′ = 1, equating this to zero gives
1/x + x(1/x)′ = 0, which shows that indeed (1/x)′ = −1/x².
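
As a sanity check on examples 7 and 8, the two results can be compared with numerical estimates. The sketch below is my own: it uses a symmetric difference quotient, which is a common variant and not necessarily the exact technique of section 1.8.1, and the test point x = 2 and step size h are arbitrary choices.

```python
def numerical_derivative(f, x, h=1e-5):
    """Symmetric difference quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def cube(x):
    return x**3


def reciprocal(x):
    return 1 / x


if __name__ == "__main__":
    x = 2.0
    # Product-rule results from examples 7 and 8, evaluated at x = 2.
    print(numerical_derivative(cube, x), 3 * x**2)
    print(numerical_derivative(reciprocal, x), -1 / x**2)
```

Agreement at one point is only a consistency check; the product-rule arguments above are what establish the formulas for every x.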


2.4 The chain rule 





2.4.1 Constant rates of change 


In addition to the product rule, the other extremely important 
rule for differentiation is the chain rule. We start with three exam- 
ples that illustrate the idea but don’t require calculus. 


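Burning calories Example 9
> Jane hikes 3 kilometers in an hour, and hiking burns 70 calories²
per kilometer. At what rate does she burn calories?


> We let x be the number of hours she’s spent hiking so far, y the
distance covered, and z the calories spent. Then

\begin{align*}
\frac{\Delta z}{\Delta x} &= \frac{\Delta z}{\Delta y}\,\frac{\Delta y}{\Delta x} \\
  &= \left(\frac{70\ \text{cal}}{1\ \text{km}}\right)
     \left(\frac{3\ \text{km}}{1\ \text{hr}}\right) \\
  &= 210\ \text{cal/hr}.
\end{align*}
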
Clowns on seesaws Example 10 


In figure d, the clown on the left drops by Δx, causing the middle
clown to go up by Δy. The ratio between these appears to





²Food calories are actually kilocalories, 1 kcal = 1000 cal.




d/ Example 10. 





e / Example 11. 





f/The chain rule allows us 
to differentiate expressions in 
which functions occur nested in- 
side other functions, like Russian 
dolls. 


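be about −3/2 based on the lengths of the two lever arms, as
determined by the position of the fulcrum. This then causes the
right-hand clown to drop by Δz, where Δz/Δy is about −2. The
result is

\begin{align*}
\frac{\Delta z}{\Delta x} &= \frac{\Delta z}{\Delta y}\,\frac{\Delta y}{\Delta x} \\
  &= (-2)\left(-\frac{3}{2}\right) \\
  &= 3.
\end{align*}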




Gear ratios Example 11


> Figure e shows a piece of farm equipment containing a train of
gears with 13, 21, and 42 teeth. If the smallest gear is driven by
a motor, relate the rate of rotation of the biggest gear to the rate
of rotation of the motor.


> Let x, y, and z be the angular positions of the three gears. Then

\begin{align*}
\frac{\Delta z}{\Delta x} &= \frac{\Delta z}{\Delta y}\,\frac{\Delta y}{\Delta x} \\
  &= \frac{13}{21}\cdot\frac{21}{42} \\
  &= \frac{13}{42}.
\end{align*}


These examples all used the following relationship among three
rates of change:

\[
\frac{\Delta z}{\Delta x} = \frac{\Delta z}{\Delta y}\,\frac{\Delta y}{\Delta x}
\tag{1}
\]

Because the rates of change were stated to be constant, it was valid
to measure them with expressions of the form Δ.../Δ..., and because
the deltas were real numbers, it was valid to use the normal
rules of algebra and cancel the factors Δy.
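
The arithmetic in examples 9 through 11 is the same operation each time: multiply the constant rates along the chain. Here is a small sketch using exact fractions. The function name `chain` is my own, and the gear ratios are entered as magnitudes, as in the worked example above.

```python
from fractions import Fraction


def chain(*rates):
    """Multiply constant rates of change along a chain, as in equation (1)."""
    result = Fraction(1)
    for rate in rates:
        result *= rate
    return result


# Example 9: (70 cal per km) * (3 km per hr)
calories_per_hour = chain(Fraction(70), Fraction(3))

# Example 10: (dz/dy about -2) * (dy/dx about -3/2)
clown_ratio = chain(Fraction(-2), Fraction(-3, 2))

# Example 11: tooth-count ratios 13/21 and 21/42
gear_ratio = chain(Fraction(13, 21), Fraction(21, 42))

if __name__ == "__main__":
    print(calories_per_hour, clown_ratio, gear_ratio)
```

Because each factor is an ordinary number, the intermediate quantity cancels just as the factors Δy do in equation (1); in the gear example the 21-tooth gear drops out of the final ratio entirely.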
2.4.2 Varying rates of change


The Leibniz notation makes it tempting to simply write down
and believe the following analogous-looking expression involving
derivatives:

\[
\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx}.
\]

In problems p1-p3 on p. 43 we verified that this seemed to work.
But how do we know that this always works with derivatives? If we
define the Leibniz notation as standing for a limit, then we need to
show this:

\[
\lim_{\Delta x \to 0} \frac{\Delta z}{\Delta x}
  = \left(\lim_{\Delta y \to 0} \frac{\Delta z}{\Delta y}\right)
    \left(\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}\right)
\tag{2}
\]

Rather than giving a formal proof, I’ve briefly sketched in box 2.2
the technical issues involved. These work out as our intuition
suggests, and we therefore have:


The chain rule 
If z is a function of y, and y is a function of x, and if the derivatives
$dz/dy$ and $dy/dx$ exist at a certain point, then at that point,

$$\frac{dz}{dx} = \frac{dz}{dy}\cdot\frac{dy}{dx}.$$


The chain rule is extremely useful in evaluating derivatives, be- 
cause many of the expressions we want to differentiate have a struc- 
ture in which a big formula is built out of smaller ones. For example, 
in problem r1 on p. 44, we found by numerical approximation that
the derivative of the function

$$\frac{1}{1-x},$$

evaluated at x = 0, was about 1.000. The chain rule gives us an easy
way to get an exact result for any x. The structure of our formula
is like this:

>Box 2.2 A sketch of the technical issues behind the chain rule


If all three derivatives in equation (2) exist, then the equation
essentially works because the limit of a product is the product of
the limits (provided that the limits exist); this is property $P_5$ of
the limit, to be discussed more formally in section 4.1, p. 95. There
are two other technical issues to worry about.

First, equation (1) is not true if $\Delta y = 0$, because we can’t divide
by zero, and if the derivative of y with respect to x happens to be
zero somewhere, then it’s reasonable to worry that this might be
forced upon us for a certain value of $\Delta x$. Although we won’t prove
it here, this issue doesn’t actually cause the chain rule to fail.

The second issue is that in equation (2), two of the limits involve
$\Delta x \to 0$, but one has $\Delta y \to 0$. This turns out not to be a problem
because, as discussed in ch. 4, a differentiable function must be
continuous (i.e., there are no gaps in its graph), and therefore if,
by assumption, y is differentiable as a function of x, then y is also
continuous, and therefore taking $\Delta x \to 0$ also causes $\Delta y \to 0$.

g / Composition of functions is like a bucket brigade. (The workers
in the photo are salvaging inventory from a warehouse after
the 2010 earthquake in Haiti.)


Take my input and 
subtract it from one. 


Divide one by my input. 





h/The function $1/(1-x)$
can be viewed as a rule for a 
two-step computation in which 
the output of the first computation 
is fed through as the input to the 
second stage. 


Writing the boxes inside the equations is cumbersome, so let’s 
call the big box z and the small one y. Then 


$$z = \frac{1}{y} \quad\text{and}\quad y = 1 - x,$$

which are both functions we know how to differentiate:

$$\frac{dz}{dy} = -y^{-2} \qquad \text{[example 8, p. 53]}$$

$$\frac{dy}{dx} = -1$$


In life, sometimes our big goals (get married and raise a family) 
break down into smaller sub-goals (buy a ring, find a priest, pla- 
cate the mother of the bride). The chain rule lets us apply this 
divide-and-conquer strategy to differentiation. Since we know how 
to differentiate z with respect to y and y with respect to x, the chain 
rule lets us solve the larger problem of differentiating z with respect 
to x: 

$$\frac{dz}{dx} = \frac{dz}{dy}\cdot\frac{dy}{dx} = \left(-y^{-2}\right)(-1) = (1-x)^{-2}$$


Plugging in x = 0, we verify that the derivative is exactly equal to 
1, in agreement with the earlier numerical calculation. 
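As a quick sanity check, the chain-rule result can be compared with a numerical estimate of the derivative. The sketch below is only illustrative: the helper names, the centered-difference formula, and the step size `dx = 1e-6` are assumptions made here, not the numerical technique used in the earlier problem.

```python
def numerical_derivative(f, x, dx=1e-6):
    """Centered-difference estimate of f'(x); dx is an arbitrary small step."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)

def z_of_y(y):
    return 1 / y          # outer function, z = 1/y

def y_of_x(x):
    return 1 - x          # inner function, y = 1 - x

def dz_dx_chain(x):
    """Chain rule: dz/dx = (dz/dy)(dy/dx) = (-y**-2)(-1)."""
    y = y_of_x(x)
    dz_dy = -(y ** -2)
    dy_dx = -1
    return dz_dy * dy_dx

for x in [0.0, 0.5, -2.0]:
    approx = numerical_derivative(lambda t: z_of_y(y_of_x(t)), x)
    print(x, dz_dx_chain(x), approx)
```

At x = 0 the chain-rule value is exactly 1, and at the other sample points the two columns should agree to many decimal places.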


2.4.3 Composition of functions 


A little more formally, we can view the chain rule as a rule for 
doing calculus on functions that are built by composition of other 
functions. The composition $g \circ h$ of functions g and h means the
function that takes an input x and gives back an output $g(h(x))$.
That is, we take the input x, stick it into h, take h’s output, put it
in g, and finally take g’s output.


The chain rule tells us how to differentiate a function built out of
such a composition. In terms of this notation, suppose that $f(x) =
g(h(x))$. Then the chain rule says that $f'(x) = g'(h(x))h'(x)$. Or,
in a simpler but more abstract notation, we can write $(g \circ h)' =
(g' \circ h)h'$.
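The function-composition form of the chain rule translates almost word for word into code. In the sketch below, `compose` and `chain_rule` are hypothetical helper names, and the particular choice $g(u) = u^3$, $h(x) = 2x + 1$ is just an illustrative assumption.

```python
def compose(g, h):
    """Return the composition of g and h, the function x -> g(h(x))."""
    return lambda x: g(h(x))

def chain_rule(g_prime, h, h_prime):
    """Return the derivative of the composition, x -> g'(h(x)) * h'(x)."""
    return lambda x: g_prime(h(x)) * h_prime(x)

# Illustrative choice (an assumption): g(u) = u**3 and h(x) = 2*x + 1
g = lambda u: u ** 3
g_prime = lambda u: 3 * u ** 2
h = lambda x: 2 * x + 1
h_prime = lambda x: 2

f = compose(g, h)
f_prime = chain_rule(g_prime, h, h_prime)

x = 1.0
dx = 1e-6
numerical = (f(x + dx) - f(x - dx)) / (2 * dx)
print(f_prime(x))   # 54.0
print(numerical)    # close to 54
```

Note that `chain_rule` needs h itself as well as the two derivatives, because g′ has to be evaluated at h(x), not at x.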


2.5 Review: exponents that aren’t natural 
numbers 


In section 2.6 we will exploit the product and chain rules to prove
the rule $(x^n)' = nx^{n-1}$ for all values of n that are nonzero rational
numbers. As preparation, we review in this section the basic idea of
exponentiation, and then the interpretation of exponents that aren’t
natural numbers.

2.5.1 Basic ideas


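We can represent repeated multiplication

$$2 \times 2 \times 2 = 8,$$

using the notation for exponents,

$$2^3 = 8.$$

Because multiplication is associative,

$$2 \times 2 \times 2 \times 2 \times 2 \times 2 \times 2 = 128$$

is the same as

$$(2 \times 2 \times 2)(2 \times 2 \times 2 \times 2),$$

so $2^7$ is the same as $(2^3)(2^4)$. In other words, multiplying powers
of the same base is the same as adding exponents,

$$b^p b^q = b^{p+q}. \tag{3}$$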


An important special case is scientific notation, which uses powers
of 10. For example, $(10^2)(10) = 10^3$.


2.5.2 Zero as an exponent 


Suppose we compute the list of decreasing powers of a given
base, for example $2^3 = 8$, $2^2 = 4$, and $2^1 = 2$. Each result is half as
big as the previous one. Therefore if we want to continue reducing
the exponent, we should clearly have $2^0 = 1$ in order to continue
the pattern. In general, $b^0 = 1$ for any nonzero base b. (The special
case $0^0$ is undefined.)


2.5.3 Negative exponents 


Continuing this pattern, we must have $2^{-1} = 1/2$. In general, a
negative exponent indicates the multiplicative inverse of the
corresponding positive power, $b^{-n} = 1/b^n$.


2.5.4 Fractional exponents 


Our rules for zero and negative exponents were consistent with
equation (3). We can also define fractional exponents that obey this
rule. For example, if $3^{1/2}$ is a number, then equation (3) requires
that $(3^{1/2})(3^{1/2}) = 3$, so an exponent 1/2 must mean the same thing
as a square root.


2.5.5 Irrational exponents 


If we want to define an expression such as $2^\pi$, we can take it to
be the limit of the list of numbers $2^3$, $2^{3.1}$, $2^{3.14}$, $2^{3.141}$, ...
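The way this list of numbers settles down can be watched numerically. The following is a minimal sketch; the particular list of truncations is an arbitrary choice made here, and Python's floating-point `**` operator stands in for the exact rational powers.

```python
import math

# Successive decimal truncations of pi, used as rational exponents
truncations = [3, 3.1, 3.14, 3.141, 3.1415, 3.14159]
values = [2 ** r for r in truncations]

for r, v in zip(truncations, values):
    print(f"2^{r} = {v:.6f}")

print(f"2^pi = {2 ** math.pi:.6f}")
```

Each additional digit in the exponent changes the result by less than the previous one did, which is the behavior we expect from a sequence that has a limit.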


2.6 Proof of the power rule in general 


In section 1.3, p. 20, I presented the rule $(x^n)' = nx^{n-1}$ for all
natural numbers n, but only explicitly proved it for n = 1 and 2.

>Box 2.3 Ideas about proof: proof by induction


Proof by induction is a technique for proving an infinite number of
facts without using infinitely many words. Call these facts, or
propositions, $P_1$, $P_2$, and so on. For example $P_n$ could be the claim
that if we kick over the first in an infinite chain of dominoes,
then the nth domino will fall as well. Induction requires two steps.

(1) We establish that $P_1$ is true. For example, if we kick over the
first domino, then $P_1$ is clearly true, since kicking it over causes
it to fall. This is called the base case.

(2) We show that if $P_{n-1}$ holds, then $P_n$ is true as well. For
example, if domino n - 1 falls, then it will cause domino n to fall
as well.





i/Proof by induction is like 
an infinite chain of dominoes. If 
we topple the first domino, then 
eventually every domino will fall. 


A good application of the product and chain rules is to extend the 
proof to all nonzero integers n and to show that it also holds for 
fractional exponents. 


Only n = 0 requires special treatment. Since $x^0 = 1$, its derivative
should be zero. Our rule sort of, but not quite, works here,
since it gives $0x^{-1}$, or $0/x$. This is certainly zero if $x \neq 0$, but in
the case where x = 0 it gives 0/0, which is undefined.


2.6.1 Exponents that are natural numbers 


Example 7 on p. 52 showed that the product rule can be used to
prove special cases of the power rule. Since we knew the derivative
of $x^2$, we were able to find the derivative of $x^3$ by rewriting it as
$(x^2)(x)$ and applying the product rule. In the same way, we can prove
the rule for any exponent n if it has already been established for
n - 1. We rewrite $x^n$ as $(x^{n-1})(x)$, differentiate using the product
rule, and find:

$$\begin{aligned}
(x^n)' &= (x^{n-1})'(x) + (x^{n-1})(x)' \\
&= (n-1)x^{n-2}x + x^{n-1}\cdot 1 \\
&= nx^{n-1}
\end{aligned}$$

By establishing the fact for n = 1, and then proving that it must
hold for n if it holds for n - 1, we establish that it holds for all
natural numbers n. This is called proof by induction (box 2.3).


2.6.2 Negative exponents 


We saw in example 8 on p. 53 that $(1/x)' = -1/x^2$, which was
exactly what we would have expected from applying the power rule
to the exponent -1. It is then straightforward to extend the result
to all negative integers by applying the chain rule to $(x^n)^{-1}$.


2.6.3 Exponents that aren’t integers


What about fractional exponents, such as $x^{1/2}$, i.e., the square
root of x? We don’t know what this derivative is yet, but let’s give
it a name. Call it f, i.e., $f(x) = (\sqrt{x})'$. Then

$$\begin{aligned}
1 &= (x)' \\
&= (\sqrt{x}\sqrt{x})' \\
&= f(x)\sqrt{x} + \sqrt{x}f(x) \\
&= 2f(x)\sqrt{x}
\end{aligned}$$

$$f(x) = \frac{1}{2\sqrt{x}} = \frac{1}{2}x^{-1/2}$$

This is exactly what we would have inferred from the power rule
$(x^n)' = nx^{n-1}$, with n = 1/2. A similar argument can be carried
out for any fractional exponent, although recognizing this is not
quite the same as writing a general proof; a general proof is given
in example 8, p. 165. The generalization to irrational exponents is
deferred until example 4 on p. 135.
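A numerical spot check is no substitute for these proofs, but it is a cheap way to catch algebra mistakes. The sketch below compares the power rule with a centered-difference estimate for a few integer, negative, and fractional exponents; the helper names, the sample point x = 2, and the step size are all arbitrary assumptions made for illustration.

```python
def numerical_derivative(f, x, dx=1e-6):
    """Centered-difference estimate of f'(x)."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)

def power_rule(n, x):
    """Derivative of x**n according to the power rule, n * x**(n - 1)."""
    return n * x ** (n - 1)

x = 2.0  # sample point, kept positive so that fractional powers are real
exponents = [3, -1, -2, 0.5, 2 / 3]

for n in exponents:
    approx = numerical_derivative(lambda t: t ** n, x)
    exact = power_rule(n, x)
    print(f"n = {n:.4g}: power rule {exact:.6f}, numerical {approx:.6f}")
```

The two columns should agree to about six decimal places for every exponent in the list.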


Economic order quantity Example 12 
Here is an extremely common problem in the business world. 
A retailer knows that there is a steady yearly demand D for the 
widgets it sells; every year, customers buy D widgets. They need 
to maintain an inventory of the product, and when they run out, 
they need to buy a quantity q from their wholesaler. Ordering from
the wholesaler costs a certain amount per widget plus a certain 
amount per order, and because of the per-order cost, the retailer 
would prefer that the quantity of widgets q in each order be big. 


The retailer also has to pay a certain amount to store all the wid- 
gets in inventory. For example, if their inventory gets too big, they 
may have to buy or rent a new warehouse. This is a reason not 
to make q too big. 


We have the following model of the retailer's yearly costs:

$$\begin{aligned}
C = {} & c_1 D \quad \text{[wholesale cost of the widgets, including shipping]} \\
& + c_2\frac{D}{q} \quad \text{[$D/q$ = number of orders; $c_2$ = fixed cost per order]} \\
& + c_3 q \quad \text{[cost of storing an inventory of $q$ widgets]}
\end{aligned}$$

We want to minimize the function C(q), taking D, $c_1$, $c_2$, and $c_3$
as constants. If q is too small, the second term dominates and
becomes large, while the same happens with the third term if q is
too big. Therefore we know that the minimum of C must occur at
some finite value of q. The function is smooth, so this minimum
must occur at a point where the derivative $dC/dq$ is zero (section
1.5.3, p. 24). Writing 1/q as $q^{-1}$ and applying the power rule, the
derivative is

$$\frac{dC}{dq} = -c_2 D q^{-2} + c_3,$$

and setting this equal to zero gives

$$q = \sqrt{\frac{c_2 D}{c_3}},$$

where only the positive square root has real-world significance.
This answer makes sense because we respond to greater demand
D by making bigger orders, and likewise if the fixed cost
per order $c_2$ is high, we will make bigger orders in order to reduce
the number of orders. If the cost $c_3$ of warehousing a widget for a
year is large (e.g., the widget is a jumbo jet), then we will order in
smaller quantities.
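The result can be checked against a brute-force search. The sketch below uses the values quoted in the caption of figure j ($c_1 D = 1$, $c_2 D = 9$, $c_3 = 1$); setting D = 1 to pin down $c_1$ and $c_2$ separately is an extra assumption made here, as are the function names and the search grid.

```python
import math

def yearly_cost(q, c1, c2, c3, D):
    """C(q) = c1*D + c2*D/q + c3*q, the cost model of example 12."""
    return c1 * D + c2 * D / q + c3 * q

def optimal_order_quantity(c2, c3, D):
    """Positive root of dC/dq = -c2*D/q**2 + c3 = 0."""
    return math.sqrt(c2 * D / c3)

# Values matching figure j: c1*D = 1, c2*D = 9, c3 = 1 (D = 1 is assumed).
D, c1, c2, c3 = 1.0, 1.0, 9.0, 1.0

q_best = optimal_order_quantity(c2, c3, D)
cost_best = yearly_cost(q_best, c1, c2, c3, D)
print(q_best, cost_best)   # 3.0 7.0

# Brute-force check on a grid of q values from 0.1 to 20.0
grid = [k / 10 for k in range(1, 201)]
q_grid = min(grid, key=lambda q: yearly_cost(q, c1, c2, c3, D))
print(q_grid)              # 3.0
```

The grid search lands on the same order quantity as the formula, and the cost rises on either side of it.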


j / Example 12, with $c_1 D = 1$, $c_2 D = 9$, and $c_3 = 1$.


2.7 Quotients


Suppose that we want to differentiate the function 


$$\frac{1}{x}.$$


The product rule tells us how to differentiate an expression involving 
multiplication, but this one uses division. However, division by a 
certain number is the same as multiplication by its multiplicative 
inverse, so we can rewrite this function in a form that we know how 


to differentiate. 

$$\left(\frac{1}{x}\right)' = \left(x^{-1}\right)' = -x^{-2} \qquad \text{[power rule]}$$


If the expression in the denominator is more complicated, we can 
do the same thing, but use the chain rule as well: 


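$$\left(\frac{1}{1+x^2}\right)' = \left[(1+x^2)^{-1}\right]' = -(1+x^2)^{-2}(2x)$$

If the numerator is not just 1, then we also have to use the product
rule:

$$\begin{aligned}
\left(\frac{x^3}{1+x^2}\right)' &= \left(x^3(1+x^2)^{-1}\right)' \\
&= (x^3)'(1+x^2)^{-1} + x^3\left[(1+x^2)^{-1}\right]' \qquad \text{[product rule]} \\
&= 3x^2(1+x^2)^{-1} + x^3\left[-(1+x^2)^{-2}(2x)\right]
\end{aligned}$$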





The foregoing examples show a technique for differentiating quo- 
tients that works in all cases, and this is how I do that type of 
derivative. Some people, however, prefer to memorize the following 
rule, which can be proved by running through the steps above for a 
function f = p/q, where p and q can be any functions at all. 


Quotient rule 
Let f = p/q, where f, p, and q are all functions. Then at any
point where p and q are both differentiable and $q \neq 0$,

$$f' = \frac{p'q - q'p}{q^2}.$$

In the examples above, the functions p and q happened to be
polynomials. A function like f that is formed in this way from the
quotient of polynomials is called a rational function.
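Since the quotient rule is just the product and chain rules in disguise, the two routes should always give the same number. The sketch below checks this for the rational function $x^3/(1+x^2)$, which is assumed here to be the example worked above; the function names are made up for illustration.

```python
def quotient_rule(p, p_prime, q, q_prime):
    """Return f' for f = p/q, using f' = (p'q - q'p)/q**2."""
    return lambda x: (p_prime(x) * q(x) - q_prime(x) * p(x)) / q(x) ** 2

def product_chain_form(x):
    """The same derivative of x**3/(1 + x**2), via product and chain rules."""
    inner = 1 + x ** 2
    return 3 * x ** 2 * inner ** -1 + x ** 3 * (-(inner ** -2) * (2 * x))

f_prime = quotient_rule(
    p=lambda x: x ** 3, p_prime=lambda x: 3 * x ** 2,
    q=lambda x: 1 + x ** 2, q_prime=lambda x: 2 * x,
)

for x in [0.0, 1.0, 2.5]:
    print(x, f_prime(x), product_chain_form(x))
```

Whether to memorize the quotient rule or to rewrite the quotient as a product is a matter of taste; the code only confirms that the choice doesn't affect the answer.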
2.8 Continuity and differentiability


2.8.1 Continuity


Intuitively, a continuous function is one whose graph has no 
sudden jumps in it; the graph is all a single connected piece. Such a 
function can be drawn without picking the pen up off of the paper. 
Formally, continuity is defined as follows. 


A function g is continuous at a if

$$\lim_{x\to a} g(x) = g(a). \tag{4}$$


A function is continuous if it is continuous at every a in its domain. 


In most cases, there is no need to invoke the definition explicitly 
in order to check whether a function is continuous. Most of the func- 
tions we work with are defined by putting together simpler functions 
as building blocks. For example, let’s say we’re already convinced 
that the functions defined by g(x) = 3x and h(x) = sin x are both
continuous.³ Then if we encounter the function f(x) = sin(3x),
we can tell that it’s continuous because its definition corresponds to
f(x) = h(g(x)). The composition of two continuous functions is also
continuous. Just watch out for division. The function f(x) = 1/x is
continuous everywhere except at x = 0, so for example 1/sin(x) is
continuous everywhere except at multiples of $\pi$, where the sine has
zeroes.


2.8.2 More about differentiability 


We mentioned briefly on p. 51 that a function is defined to be 
differentiable or nondifferentiable at a particular point depending 
on the existence of the limit referred to in the definition of the 
derivative,

$$f'(x) = \lim_{h\to 0}\frac{f(x+h) - f(x)}{h}.$$





Figure l shows two common reasons why a function would not be
differentiable at a certain point: because it has a kink, or because 
it is discontinuous. If a function is discontinuous at a given point, 
then it is not differentiable at that point. 


Although differentiability implies continuity, a function can be 
continuous without being differentiable; see example 13. 


We seldom have to resort to limits and epsilon-delta arguments 
in order to determine whether a function is differentiable at a par- 
ticular point. Here are three methods that, when they apply, are 
usually easier: 





3The reader who has forgotten all of his/her trig is directed to the review in 
section 5.3. 





k / A discontinuous function.


l / The function is not differentiable at $x_1$ because it has
a kink there, and is not differentiable at $x_2$ because it has a
sudden jump.





m / Reflected light forms a geometrical curve inside a
teacup. The curve has a kink similar to the one at $x_1$ in figure
l. This kink is of a special type called a cusp, in which the two
branches are parallel where they meet.


n / Example 13.

p / Example 15.


1. Graph the function and apply the informal definition of the
derivative from section 1.2.1, p. 14. That is, imagine trying
to zoom in on the point of interest until the curve appears
straight, and then measuring its slope. If something goes
wrong in this process, then the function isn’t differentiable.


2. Often we deal with functions that have been defined by a for- 
mula, which means building it out of other functions through 
arithmetic operations and composition. If all of these func- 
tions and operations are differentiable at the point of interest, 
then the function is differentiable. 


3. If the function f has been defined by a formula, then it will
usually be possible to differentiate it using the differentiation
rules and write the result as a new formula for f’. Often
there will be only certain specific points where the formula
for f’ is undefined, so these are the points where f isn’t
differentiable.


The absolute value function Example 13 


> Where is the function y = |x| differentiable? 


> By visualizing the graph, figure n, and applying method 1 we 
can tell immediately that it’s differentiable everywhere except at 
x = 0. At x = 0, there is a kink, and no matter how far we zoom
in, the kink will never look like a line.
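The "zooming in" of method 1 can be imitated numerically by measuring the slope just to the left and just to the right of a point with smaller and smaller steps. This is only a rough sketch: the helper name `one_sided_slopes` and the step sizes are assumptions made here, and agreement of the two slopes at a few step sizes suggests, but does not prove, differentiability.

```python
def one_sided_slopes(f, x, h):
    """Difference quotients of f at x, using a step h to the left and right."""
    left = (f(x - h) - f(x)) / (-h)
    right = (f(x + h) - f(x)) / h
    return left, right

for h in [1e-1, 1e-3, 1e-6]:
    at_kink = one_sided_slopes(abs, 0.0, h)
    elsewhere = one_sided_slopes(abs, 2.0, h)
    print(h, at_kink, elsewhere)
```

At x = 0 the two slopes stay at -1 and +1 no matter how small the step, while at x = 2 they agree, which is the numerical signature of the kink.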


Not differentiable when dividing by zero Example 14 
> Where is the function f(x) = 1/(x — 1) differentiable? 


> Let's use method 2 above. This function can be built out of 
the composition of functions as f(x) = g(h(x)), where g(x) = 1/x 
and h(x) = x — 1. Both of these functions are well-behaved ev- 
erywhere, except that g isn’t differentiable where it blows up at 
x = 0. Therefore the function f is differentiable everywhere except
at x = 1, which is where h(x) = 0 is the input to g(x).


Differentiability of the cube root    Example 15


> Where is the function $y = x^{1/3}$ differentiable?


> Let's use method 3. The power rule gives $y' = \frac{1}{3}x^{-2/3}$. This
is well defined everywhere except at x = 0, where it blows up to
infinity. Therefore y is differentiable everywhere except at x = 0.


Nondifferentiable ingredients, differentiable result    Example 16
Method 2 can prove that a function is differentiable, but cannot
necessarily be used to prove it nondifferentiable. For example,
consider the function $y = x^3(1+1/x)$. The second factor blows up
to infinity at x = 0, which makes us suspect that y is not differentiable
there. But in fact the formula can be rewritten as $y = x^3 + x^2$,
which is clearly differentiable everywhere. Although the second
factor in the original form blows up at x = 0, the first factor vanishes
there so rapidly that the product also vanishes, and vanishes
smoothly.


2.8.3 Zero derivative at the extremum of a differentiable 
function 


We saw in section 1.5.3, p. 24, that although searching for a
zero derivative may be a good way to find an extremum, it doesn’t
always work. Looking at the zoo of possibilities in figure q, we see
that both of the following statements are false:


1. If a function has a local extremum, it must have a zero deriva- 
tive there. (False: fails at A, E, and F.) 


2. If a function has a zero derivative somewhere, that must be a 
local extremum. (False: fails at H.) 


In mathematical jargon, we say that a zero derivative is neither a 
necessary (1) nor a sufficient (2) condition for a local extremum. 


We can, however, make a more restricted statement of 1 that is 
true. 


Theorem 

If a function f is continuous on an interval [a, b] and differentiable
on (a, b), and if there is a point $c \in (a, b)$ for which f(c)
is a maximum or minimum in the interval, then f’(c) = 0.


Let’s see why all the conditions are necessary. The assumption
of continuity is needed because of points like E. We need differentiability
because of F. We also needed to assume that c was on the
interior of the interval, since otherwise it would have been possible
to choose b so that point E lay at x = b.


Proof: We prove the case where f(c) is a maximum, as in figure 
q; the other case is exactly analogous. Since f is assumed to be 
differentiable, it’s differentiable at c, and since c is on the interior 
of the interval, differentiability means that the derivative must have 
the same value regardless of whether we approach c from the right 
or from the left. (At a nondifferentiable point such as F, the two 
limits could be unequal.) Let’s look at both of these limits. The 
limit from the left is 





$$\lim_{h\to 0^-}\frac{f(c+h) - f(c)}{h}.$$

But since we assumed f(c) to be the greatest value on [a, b], the
numerator is less than or equal to zero, and h is negative, so the
quantity inside the limit is guaranteed to be greater than or equal
to zero. The limit exists, since we assumed differentiability, so the
limit must also be greater than or equal to zero. Similarly, the limit
from the right

$$\lim_{h\to 0^+}\frac{f(c+h) - f(c)}{h}$$

must exist and be less than or equal to zero. Since the two limits
are equal, they equal zero.


q / If point D is a maximum over the interval [a, b], then f’
equals zero at D.














2.9 Safe handling of dy and dx 


We’ve seen that although the real number system doesn’t include 
infinitely big or infinitely small quantities, it can nevertheless be 
extremely useful to think of a notation like $dy/dx$ as the quotient of
two infinitely small numbers. For example, it allows us to check our 
work in differentiation by checking the units of the result (example 
8, p. 28), and it makes the chain rule look so obvious that there 
would never be any danger of forgetting it. When the calculus was 
first invented, these infinitely small numbers were referred to as 
infinitesimal numbers. The idea behind the word is that just as a 
decimal is one tenth, an infinitesimal is one “infinitieth.” 


We now confront the question of when it’s safe to treat dy and 
dx as if they were numbers. This kind of manipulation is like nuclear 
energy: it can be used for good and for evil, and if you want to use it 
safely, you have to know what you’re doing. In this section we lay out 
some simple safety rules which, if followed, will prevent all nuclear 
meltdowns. Just as we enriched the set of natural numbers to make 
the rational numbers, and the rational numbers to make the reals, 
we continue the march of progress by making an even larger number 
system called the hyperreal numbers, which includes infinitesimals. 
For a more detailed exposition at the freshman-calculus level, see 
the excellent free online book by Keisler, Elementary Calculus: An 
Approach Using Infinitesimals. 


We start with two preliminary definitions. 


Definition: Suppose that for a certain nonzero number d, we
have $|d| < 1$, $|d| < 1/(1+1)$, $|d| < 1/(1+1+1)$, ... and so on for
all inequalities of this form. Then we say that d is infinitesimal.


Definition: Let H be a hyperreal number (which may or may 
not also be a real number). Suppose that there exists some real 
number r such that |H —r| is infinitesimal. Then we say that r is 
the standard part of H. 


Rule 1. The hyperreal numbers obey all the same elementary 
axioms as the real numbers (section 1.6, p. 25). 


The hyperreal numbers include at least one infinitesimal number,
call it d. By rule 1, we can apply the multiplicative inverse
axiom to d, so 1/d is also a well-defined hyperreal number, and
clearly 1/d is bigger than 1, bigger than 1 + 1, and so on, so the
hyperreal number system includes both infinitely big and infinitely
small quantities.





4Cf. example 11, p. 113. For an application to economics, see rule 3, p. 218.


It can be proved from the elementary axioms that if d is nonzero,
then $2d \neq d$. Therefore the hyperreal number system includes a
variety of sizes of infinitesimals. This is important, because if all
infinitesimals were the same size, then $dy/dx$ would always have to
equal one! It also follows from the axioms that $1/d \neq 1/(2d)$, so
infinite numbers come in different sizes as well. We therefore have:


Rule 2. The symbol $\infty$ and the term “infinity” do not stand for
any real number, and do not stand for any specific hyperreal number.
They are in fact not very useful in the context of the hyperreals.


Breaking the rules gives a nuclear meltdown Example 17 
Suppose that the universe is infinite, so that there are infinitely 
many animals in the universe that, like us, have two eyes. The 
number of left eyes is some infinite hyperreal number H, and H is 
also the number of right eyes. The total number of eyes is then 


H+H=2H. 


Everything is all right, and 2H is an infinite number that happens 
to be twice as big as H. 


But now suppose we break rule 2 and use the symbol $\infty$ indiscriminately
for any positive, infinite quantity. Then we have

$$\infty + \infty = \infty.$$

Applying the additive inverse axiom, we can cancel an $\infty$ from
each side, giving

$$\infty = 0,$$

which is absurd.


The paradox didn’t result from talking about infinite numbers. It 
came from breaking one of the rules for manipulating them cor- 
rectly. 


Historically, one of the main sources of confusion about infinitesi- 
mals was the sketchy practice of discarding the square of an infinites- 
imal (p. 47). This is resolved as follows: 


Rule 3. The derivative of y with respect to x is defined as the
standard part of $dy/dx$.


Redoing the example from p. 47 according to this rule, we have 
the following calculation of the derivative of y = x? at x = 1: 


$$\begin{aligned}
\frac{dy}{dx} &= \frac{(1+dx)^2 - 1}{(1+dx) - 1} \\
&= 2 + dx \\
y' &= \text{standard part of } 2 + dx \\
&= 2
\end{aligned}$$
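The bookkeeping in this calculation can be mimicked with a toy model in which a number is stored as a list of coefficients of powers of a single infinitesimal d. This is emphatically not an implementation of the hyperreals: it handles only finite polynomials in d, and every function name below is made up for this sketch. It does show mechanically how dividing by dx and then taking the standard part replaces the old practice of discarding squares.

```python
def poly_mul(a, b):
    """Multiply polynomials in d; a[k] is the coefficient of d**k."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_sub(a, b):
    """Subtract polynomial b from polynomial a."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [ai - bi for ai, bi in zip(a, b)]

def divide_by_d(a):
    """Divide by d; only valid when the constant term is zero."""
    if a[0] != 0:
        raise ValueError("quotient would be infinite")
    return a[1:]

def standard_part(a):
    """Discard every term containing d, keeping the real part."""
    return a[0]

x = 1
x_plus_dx = [x, 1]                                       # represents x + d
dy = poly_sub(poly_mul(x_plus_dx, x_plus_dx), [x ** 2])  # (x + d)**2 - x**2
quotient = divide_by_d(dy)                               # dy/dx, a polynomial in d

print(quotient)                  # [2, 1], i.e. 2 + d
print(standard_part(quotient))   # 2
```

The quotient still contains the leftover term d, and only the final standard-part step removes it.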





Although this particular modern approach to calculus makes $dy/dx$
not a synonym for y’, the notational distinction is not assumed in
a general context, since they were thought of as synonyms for hundreds
of years.


>Box 2.4 Why 0! equals 1

We define 0! = 1, both because it turns out to be more convenient
in all of our applications, and for the following logical reason.

In the more usual case where $n \geq 1$, n! is defined as a product
containing n factors. If we start with a rubber band, then stretch
it successively by all of these factors, we end up stretching it by a
factor of n! over all.

In the case of n = 0, we have no factors in our list, so we have
nothing on our list of things to do to the rubber band. It is left at
its original length. It has been stretched by a factor of 1, i.e., left
alone.

Note that exactly the same logic applies to exponents, and that’s
why we also define, for example, $7^0 = 1$.


Ideas very much like rules 1 and 3 were in fact originally proposed
by Leibniz,⁵ but not until the 1960s were they restated precisely
enough to satisfy the mathematical community. In the interim,
there was considerable suspicion of infinitesimals (Georg Cantor
famously referred to them as “infect[ing] mathematics” like a
“cholera-bacillus”), and today many mathematicians dislike them,
despite their logical rehabilitation, as a matter of taste.


A not-quite proof of the chain rule    Example 18
The Leibniz notation for the chain rule

$$\frac{dz}{dx} = \frac{dz}{dy}\cdot\frac{dy}{dx}$$

makes it look as though its proof were a matter of trivial algebra:
just cancel the factors of dy. This isn’t quite valid, however, as a
rigorous proof, because the derivative is really not the quotient of
two infinitesimals but the standard part of that quotient.


A calculator for infinite and infinitesimal numbers    Example 19
A web-based calculator at lightandmatter.com/calc/inf lets
you play with infinite and infinitesimal numbers. It provides one
built-in infinitesimal number d that satisfies the definition on p. 64.
The following example shows some sample calculations.


2+2
4
d+d
2d
d<1/1000
true
d>0
true


2.10 The factorial 


In a number of places in this course, it will be helpful to know 
about a function called the factorial. The factorial of n, notated n!, 
is defined as the product of all the integers from 1 to n, 


$$n! = 1\cdot 2 \cdots n.$$


For example, 3!, read as “three factorial,” is $1\cdot 2\cdot 3 = 6$. As a special
case, we define 0! to be 1 (not zero), for the reasons given in Box
2.4.
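The definition, including the special case, falls out naturally if the product is computed by starting from 1 and multiplying in the factors one at a time. The following is a minimal sketch; Python's standard library already provides `math.factorial`, which is used here only as a cross-check.

```python
import math

def factorial(n):
    """Product of the integers from 1 to n; the empty product gives 0! = 1."""
    if n < 0:
        raise ValueError("factorial is only defined here for n >= 0")
    result = 1                 # a stretch factor of 1 leaves things alone
    for k in range(1, n + 1):
        result *= k
    return result

print([factorial(n) for n in range(6)])   # [1, 1, 2, 6, 24, 120]
```

For n = 0 the loop body never runs, so the starting value of 1 is returned unchanged, which mirrors the rubber-band argument for 0! = 1.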





>Blaszczyk, Katz, and Sherry, “Ten misconceptions from the history of anal- 
ysis and their debunking,” arxiv.org/abs/1202.4153.


2.11 Style


Style is important. If you say true things in poor style, people will decide that you’re stupid and 
ignore you. You know enough calculus to appreciate some examples. 


1. Use equals signs. State what it is that you’re calculating.


wrong
$3x(x + 4)$
$3x^2 + 12x$
$6x + 12$

right
$[3x(x + 4)]'$
$= (3x^2 + 12x)'$
$= 6x + 12$


2. The Leibniz notations d and d/dx are operations (like $\sqrt{\ }$), not numbers.


Question: Differentiate $x^2$.
Wrong answer: d/ = 2x
Wrong answer: d/dx = 2x
Right answer: $d(x^2)/dx = 2x$
Right answer: d(...)/dx = 2x


3. Immediately make obvious simplifications.


wrong
$(x^2 + 3)'$
$= 2x^1 + 0$

right
$(x^2 + 3)'$
$= 2x^1 + 0$    [or don’t write this at all]
$= 2x$


4. Simplification should usually reduce the number of symbols.


wrong
$[(x^2 + 1)^3]'$
$= 3(x^2 + 1)^2(2x)$
$= 3(x^4 + 2x^2 + 1)(2x)$    [uglification]

right
$[(x^2 + 1)^3]'$
$= 3(x^2 + 1)^2(2x)$
$= 6x(x^2 + 1)^2$    [simplification]

wrong
$[1/\sqrt{1 + x}]'$
$= -\dfrac{1}{2(1 + x)\sqrt{1 + x}}$    [uglification]

right
$[1/\sqrt{1 + x}]'$
$= [(1 + x)^{-1/2}]'$
$= -\frac{1}{2}(1 + x)^{-3/2}$    [Stop here.]


5. Don’t use a complicated technique when a simple one will do.


wrong
[...]    [quotient rule]

right
[...]    [known fact]
[...]    [power and chain rules]


Review problems


al Compute 37377, v 


a2 Compare u = 107!°" with v = 19710", (Note that expo- 
nentiation is not associative, and an expression of the form a” is 


interpreted as al). 


a3 Solve $16^x = 1/2$ for x. ✓


Problems


Example 2 on p. 48 demonstrates a way of guessing a limit by plug- 
ging in numbers and making a table of values. Do the same thing 
in problems b1-b8. 


b1 


te 
lim (x — 2) tan 5 


b2 


z0 ,/1 — cosx 


(As always in this course, trig functions are assumed to take angles 
in radians. Put your calculator in radian mode.) 


b3 


lim x7 e7"Vle! 
x20 


In example 5 on p. 51 we found the derivative of the function y(x) = 
x? by directly applying the definition of the derivative as a limit. In 
problems c1-c4, apply the same brute-force technique to the given 
functions. 

cl u(a) =a? ata=1 

c2 0 p(j) =F at j=l 


c3 t(c) =4 atc=1 


c4 s(n) =~ atn=1 


el Differentiate ~/x with respect to x. > Solution, p. 227 


e2 Differentiate the following with respect to x:





[Thompson, 1919] > Solution, p. 227


e3 The following table shows the barometric pressure P and
average July temperature T for the summit of Mount Everest and
the city of Wenzhou, China, which is at the same latitude.


            pressure (kPa)    temperature (°C)
Wenzhou     101               +29
Everest     38                -16

A physical model predicts the following relationship between these 
two variables: 
T=T)+cP*/" 


Here c is a constant and $T_0$ = -273°C is a constant that converts
from degrees Celsius to a temperature scale based on absolute zero.
(a) Estimate c from the data at Wenzhou. ✓
(b) T is a complicated nonlinear function of P, and for some purposes,
such as mental estimation, a linear approximation might be
more convenient to work with. Find the equation of the tangent line
to this function at the point representing the conditions at Wenzhou,
and use this equation to calculate the expected temperature at the
summit of Everest. This is quite a long extrapolation. How good
an approximation is it? ✓


e4 Use the product rule to prove the vertical stretch property 
of the derivative (p. 16). > Solution, p. 227 


In problems g1 and g2, compute each derivative by two different 
methods: (a) by multiplying out the given expression and then dif- 
ferentiating, and (b) by using the product rule. Make sure that you 
get the same answer by both methods. 


gl y= (22? +24 1)yz. Vv 


g2 y=(x+5)(x? +1). v 
i1 (a) Consider the function $f(x) = xe^x$, where e is the base
of natural logarithms. Use the technique described in section 1.8.1,
p. 30, to find f’(1), to three decimal places of precision. ✓
(b) In example 6, p. 51, we conjectured that the derivative of $e^x$ was
simply $e^x$. This is discussed in greater detail in ch. 5, but for now
let’s just assume that it’s true. Given this fact, use the product rule
to differentiate the function f. Check that the result is consistent
with your answer to part a. ✓


i2 We’ve established the power rule using limits, which are the
most common modern tool for defining derivatives. By this rule,
the derivative of x^3 is 3x^2, and evaluating this at x = 1 gives a
derivative of 3.


Chapter 2 began by showing a more old-fashioned technique for
differentiating x^2 at x = 1 (p. 47). Apply this technique to x^3
at x = 1, and show that it agrees with the result found above.
> Solution, p. 228 


i3 Differentiate (2x + 3)^100 with respect to x.
> Solution, p. 228 


i4 Differentiate (a + 1)'°°(a + 2)? with respect to 2. 
> Solution, p. 228 


i5 Use the chain rule to differentiate ((x)?)?, and show that 
you get the same result you would have obtained by differentiating 
x, [M. Livshits] > Solution, p. 228 


i6 In section 2.4.3 on p. 56, we expressed the chain rule without
the Leibniz notation, writing a function f defined by f(x) = g(h(x)).
Suppose that you’re trying to remember the rule, and two of the
possibilities that come to mind are f'(x) = g'(h(x)) and f'(x) =
g'(h(x))h(x). Show that neither of these can possibly be right, by
considering the case where x has units. You may find it helpful to
convert both expressions back into the Leibniz notation.
> Solution, p. 228 
Compute the derivative of each of the functions in problems j1 and j2
by two different methods: (a) by multiplying out the given expression 
and then differentiating, and (b) by using the chain rule. Make sure 
that you get the same answer by both methods. 


jl y= (42?) v 


j2 y = (a? +241) Vv 


In problems k1-k7, differentiate the given function, and try to sim- 
plify your answer as much as possible. 


kl c(d)=d+1+(d+1) Vv 
k2 
b-—2 
a(b) 7 ae 
Vv 
k3 








Vv 
k4 h(z) = Vv1—- 2 Vv 
k5 
at +6 
h(t) = eer (a, b, c, and d are constants.) 
Vv 
k6 
1 
p(c) = (1+) 
Vv 
k7 
m 
= aaa 
Vv 
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In problems mi-m1, j, k, 2, and m, are constants. Calculate the 
given derivatives. Simplify answers where possible. 


m1 a [(¢s? + ks)™] (where j 40 and m # 0) v 
d v 
ay eee Vv 
me! a (5, :) 


m3 < [(ew + m)V jo + B| J 


d/ e 
9 ie V 
a (a=) 


n1 Suppose that we put a stick on a table and use a ruler
to measure its length L. According to Einstein’s theory of special
relativity, if the stick is instead in motion at speed v relative to the
ruler, then we get a different, shorter length given by

M = L \sqrt{1 - v^2/c^2},

where c is the speed of light. We don’t notice this effect in everyday
life because ordinary velocities are so small compared to c. (a)
Calculate dM/dv, the rate at which the stick shortens with increasing
speed. (b) Check the units of your answer (section 1.9, p. 34).
(c) Check that the sign of the result makes sense. (d) Discuss the
behavior of your result if v = c. √
The function of problem p1, with a = 3, b = 1, and f_o = 1.
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n2 Suppose that a distant galaxy is moving away from us at 
some fraction u of the speed of light. Then the vibration of the light 
waves we receive from it is slowed down by the factor 


compared to what we would have observed if it hadn’t been in mo- 
tion relative to us. This is called the Doppler effect. Compute the 
derivative dD/ du, which measures how sensitive the effect is to the 
velocity. Vv 





pl When you tune in a radio station using an old-fashioned 
rotating dial you don’t have to be exactly tuned in to the right fre- 
quency in order to get the station. If you did, the tuning would be 
infinitely sensitive, and you’d never be able to receive any signal at 
all! Instead, the tuning has a certain amount of “slop” intention- 
ally designed into it. The strength of the received signal s can be 
expressed in terms of the dial’s setting f by a function of the form 
s = 1/\sqrt{a(f^2 - f_o^2)^2 + b f^2},
where a, b, and f_o are constants. The constant b relates to the
amount of slop. This functional form is in fact very general, and
is encountered in many other physical contexts. The graph shows
an example of the kind of bell-shaped curve that results. Find the
frequency f at which the maximum response occurs, and show that
if b is small, the maximum occurs close to, but not exactly at, f_o.
> Solution, p. 229 








p2 Many cactuses are approximately cylindrical in shape. In 
order to minimize the loss of water through evaporation, it is ad- 
vantageous for a cactus to have a minimum surface area for a given 
volume. Find the proportion of height to diameter that achieves 
this, taking the cactus to be a cylinder with only its top and sides 
exposed. Vv 


Limits; techniques of differentiation 


p3 An atomic nucleus is made out of protons and neutrons. The 
number of protons is called Z and the number of neutrons N. Figure 
s on p. 76 shows a chart of all of the nuclei that have been observed 
and studied to date. Most of these are unstable: they undergo 
radioactive decay in a certain amount of time, and therefore are not 
found in the earth’s crust, so they can only be produced artificially. 


The stable nuclei are shown on the chart as black squares, and we 
can see that they follow a certain curve. Unstable nuclei that lie 
below and to the right of the line of stability have too many neutrons 
in proportion to their protons, and they undergo a decay process in 
which a neutron is converted to a proton, causing the nucleus to 
move one step diagonally on the chart, as in the game of checkers. 
Similarly, nuclei with too few neutrons move by diagonal steps down 
and to the right. Defining A = N + Z, these decay processes keep 
A constant. 


In the liquid drop model, the nucleus is treated as a continuous fluid 
with certain properties such as surface tension. Since the fluid is 
continuous, we can pretend that N and Z are capable of taking on 
any real-number values. (This is similar to the water molecules in 
the reservoir on p. 14.) In this model, a nucleus has a certain energy, 

B= p72 a 4 AW 22) ea 


o) 


where b ≈ 0.031, and for simplicity we have left out an over-all
constant of proportionality with units of energy. Let’s consider E 
as a function of Z, and A as a constant. Since radioactive decay 
requires the release of energy, and our radioactive decay processes 
keep A constant, a nucleus will be stable if it has the value of Z that 
minimizes the function E(Z). 


(a) Find this stable value of Z, in terms of A and b. Vv 
(b) For light nuclei, we observe that the stable nuclei have about 
half protons and half neutrons. Verify this from your answer to part 
a. 

(c) The heaviest nucleus shown as a black square on the chart is a 
uranium nucleus with Z = 92 and A = 238. Verify that your answer 
to part a passes close to this point. 
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Problem p3. 
r1 One car is driving north, along the y axis, so that at time t
its y coordinate is y = t. Another car is driving west, along the x
axis, with x coordinate x = 1 - t. Initially, at t = 0, the second car
is aimed straight at the first one.

(a) Use the Pythagorean theorem to find the function r(t) giving 
the distance r between the two cars at time t. Eliminate x and y 
from your expression by using the equations above, so that it only 
has t in it. √
(b) Find the time at which the distance is at a minimum. (You may 
find it helpful to employ the shortcut demonstrated in the solution 
to problem p1.) Vv 




r2 A fancy factory can’t produce anything if it has no workers to 
keep it running, but on the other hand a big crowd of workers stand- 
ing around in a vacant lot also can’t do anything. Businesses need 
to balance their spending on labor L and the amount E invested in
capital equipment, such as machinery. In 1928, economists Charles 
Cobb and Paul Douglas used macroeconomic data from the U.S. to 
come up with the following model for production. 


PS Cree 


Here P is the amount produced, and c and a are constants. Suppose
that a business has a fixed amount of capital T, so that


L+E=T. 


(a) Use the second equation to eliminate E, and find the optimal 
fraction L/T of capital that should be spent on labor. (b) Show that 
your answer has the correct behavior in the special cases a = 0, 1/2, 
and 1. 


r3 A slice of pie subtending an angle θ (in radians) is cut from
a pie of radius r. (You may wish to review the definition of radian 
measure, section 5.3.1, p. 128.) 

(a) Find the perimeter P of the slice, i.e., the sum of the lengths of 
its two straight sides plus the arc length of the curved side. Vv 
(b) Find the area A of the slice. v 
(c) Suppose we want to make a pie-slice shape with the minimum 
possible perimeter for a fixed area. (The radius r is not fixed.) 
Use your answer to part b to eliminate r from part a, and find the
perimeter as a function of A and θ. √
(d) Find the value of θ that minimizes the perimeter, treating A as
a constant. √
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r4 A camera takes light from an object and forms an image 
on the film or computer chip at the back of the camera inside its 
body. Let u be the distance from the object to the lens, and v the 
distance from the lens to the image. These distances are related by 
the equation 


1/f = 1/u + 1/v,
where f is a fixed property of the lens, called its focal length. When 
we want to focus on an object at a particular distance, we have to 
move the lens in or out so that u and v fulfill this equation; in an 
autofocus camera this is done automatically by a small motor. Let 


L=u+v 


be the distance from the object to the back of the camera’s body, 
and suppose that we want to take a picture of an object as nearby 
as possible, in the sense of minimizing L. 

(a) Solve the first equation for v, and substitute into the second
equation to eliminate v, thereby expressing L as a function that
depends only on the variable u (and the constant f). √
(b) Find the value of u that minimizes the function L(u).
(c) Find the minimum value of L. √


Problems t1-t7 can be done using methods 1-8 on p. 62. 


t1 Sketch the graph of the function 
1 
x) = ——_ 
LOS aaa 
by plotting a few points, including ones where x is negative, zero,
and positive. Is f differentiable at x = 0? > Solution, p. 230


t2 Let the function f be defined as f(x) = 1/sin x, where the sine
function takes its argument in radians. Where is f discontinuous? 
Where is it nondifferentiable? You do not have to evaluate the 
derivative in order to answer this question, but you do need to recall 
basic properties of the sine function. If you’ve forgotten your trig, 
you may need to look at the review in section 5.3, p. 128. 

> Solution, p. 230 


t3 A cusp is a special type of kink, in which the two branches are 

parallel where they meet. An example is shown in figure m on p. 61. 

For which values of the exponent p does the function f(x) = |x|^p

have a cusp at x = 0? For which values is it nondifferentiable? 
> Solution, p. 230 


t4 List any nondifferentiable points of the following functions. 
f@)=@- 18" -@41)" 
ae 2 leas 
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t5 List any nondifferentiable points of the function 


h(x) = V2? 4+ 24. 





t6 Find any nondifferentiable points of the function 
. 1 
j(z) ~~ 2 _—7 

t7 Determine the domain of the function 


U(x) = a7 Vz, 


and locate any nondifferentiable points in its domain. 


ul A certain line has the following properties: (1) It passes 
through the point (0, —c), where c is a positive constant. (2) Its slope 
is positive. (3) It is a tangent line to the parabola y = x^2. Find the
slope of the line. Check that your result makes sense in the special 
case c = 0, that it shows the correct trend as c grows, and that it 
does something appropriately nasty if, contrary to assumption, c is 
negative. Vv 


u2 A line passes through the point (0,1), and is also tangent 
to the curve y = cx?, where c is a constant. Find the x coordinate 
of the point of tangency. Check that your result has the right sign 
when c is positive, also makes sense when c < 0, has the correct 
trend as c gets closer to zero, and does something appropriately 
nasty if c= 0. 


u3 Let the functions f and g be defined by f(x) = 2x? and 
g(x) = x* +c, where c is a constant. If c = 0, then the two functions 
are tangent to each other only at the origin. Find the only nonzero 
value of c such that they are tangent somewhere else. v 
Use the ε-δ definition to prove the limits in problems w1-w2. The
good news is that these limits were chosen to be the easiest possible 
examples to prove directly from the definition. The bad news is that 
these may feel like artificial exercises, since the functions are contin- 
uous and defined at the relevant points, so that the limits could have 
been more easily determined by simply plugging the number into the 
formula. The reason for doing them is that they will help you to 
understand the definition of the limit. 


wl 
lim 27 — 4 = —2 
al 
w2 
lim /z = 0 
x0 
w3 Compute

lim_{x \to 0} x \sin(1/x)

and prove your result directly from the ε-δ definition. If you don’t
remember the properties of the sine function, consult section 5.3, 
p. 128. 


y1 Generalize the product rule from two factors to three. Cf. problem
y6. > Solution, p. 230
y2 Is it true that if lim_{x \to a} f(x) exists then f is continuous at
x = a?


y3 The number 1 can be defined as the smallest positive integer. 
(a) Recall that rational numbers are defined as the ratios of integers, 
i.e., fractions such as 2/3. Give a proof by contradiction to show that 
there is no smallest positive rational number. Proof by contradiction 
was introduced in box 2.1 on p. 47. (b) Suppose that someone 
proposes interpreting a symbol like dx as the smallest positive real 
number that exists. Assume the properties of the real numbers given 
in section 1.6, p. 25. Prove that there is no such least real number. 


y4 The factorial n! = 1·2···n was introduced in sec. 2.10, p. 66,
and proof by induction in sec. 2.6.1, p. 58. Prove by induction that
n! > n^2 for n ≥ 4.




y5 Let f(x) = x^n, where n is an integer greater than or equal
to 1, and suppose that we want to evaluate f'(1) directly using
the definition of the limit, i.e., using the brute-force technique of
example 5, p. 51. This will involve multiplying out the expression
(1 + Δx)^n - 1, after which we end up throwing away everything except
for the lowest-order nonvanishing term (i.e., the term with Δx to
the first power). All we really need is the coefficient of this term,
which in example 5 was 2. For a particular value of n, we could just
go ahead and multiply out this expression, but suppose we would
rather prove the result for all n. This requires that we prove a
general result for the coefficient of the linear term in the expression
(1 + Δx)^n. Such a coefficient is called a binomial coefficient. Proof
by induction was introduced in section 2.6.1, p. 58. Use a proof by 
induction to show that the binomial coefficient we’re talking about 
equals n. 


y6 Proof by induction was introduced in section 2.6.1, p. 58. 
Use a proof by induction to generalize the product rule from two 
factors to n factors, where n is any natural number. Cf. problem 
y1.


y7 Recall from p. 60 that a rational function is the quotient of 
two polynomials. Define the nastiness, N[r] of a rational function r 
to be the sum of the orders of its numerator and denominator, when 
it has already been simplified as much as possible. For example, 
a Ee +1 
x4 —1 
If we take the derivative of a rational function, the result is again 
a rational function. We may get lucky and find that the result can 
be simplified, but in most cases the result will be more complicated 
than the original function, as measured by nastiness. Determine an 
upper bound on N[r’], stated as an inequality in terms of N[r]. 





|=4 +2=6. 
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Chapter 3 
The second derivative 


3.1 The rate of change of a rate of change 


On p. 22 in section 1.5.1, we briefly encountered the idea of the 
acceleration of an object. The acceleration is the rate of change of 
velocity, while the velocity is the rate of change of position. That is, 
the acceleration is the rate of change ...of a rate of change! If that 
seems like a strange concept to you, then you’re in good company. 
After Newton and Leibniz invented the calculus, George Berkeley, 
Bishop of Cloyne, published a brutal critique called “The analyst: a 
discourse addressed to an infidel mathematician.” Berkeley wrote: 


Our modern analysts are not content to consider only 
the differences of finite quantities: they also consider 
the differences of those differences, and the differences 
of the differences of the first differences. And so on ad 
infinitum. 

But the velocities of the velocities, the second, third, 
fourth, and fifth velocities, etc., exceed, if I mistake not, 
all human understanding. The further the mind analy- 
seth and pursueth these fugitive ideas the more it is lost 
and bewildered. 


Although some of Berkeley’s critique was in fact valid, there are 
many situations where it’s perfectly natural to want to talk about 
a change in the rate of change. Figure a shows beer fermenting 
energetically at the Timmermans brewery in Belgium. Anyone who 
has watched this delightful process has seen the same story play 
itself out. A small population of dormant yeast cells is dumped 
into a delicious broth of malted barley. They find themselves in 
an ideal environment in which to raise children. At first the signs 
of fermentation are modest: a few bubbles as the small group of 
colonists starts to convert sugars to alcohol and carbon dioxide. 
But by the next morning the happy flood of procreation is going 
like crazy. A flood of foam is gushing out of the fermentation vessel. 


In this example there is nothing more natural than to say: the 
fermentation is speeding up. Let y be the amount of carbon dioxide 
that has been produced so far. (We could just as well have defined 
y as the amount of alcohol, but the CO2 bubbles are what we see.) 
Then the derivative of y with respect to time, y', is the rate of
change of y. When we say that fermentation has sped up, we’re
talking about y''. At the time shown in figure a with a dotted line,
y'' is large and positive. One way to tell this is that the slope of the
y' graph is large and positive at this moment. In this stack of three
graphs, the slope on each graph corresponds to the value of the one
below at any given time.


a / Beer is a natural food that is high in vitamin E.

b / The functions y = 2x, x^2, and 7x^2.

c / The functions y = x^2 and 3 - x^2.

d / The function y = x^3 has an inflection point at x = 0.


In modern terminology, y” is referred to as the second derivative 
of y. 


3.2 Geometrical interpretation 


The second derivative can be interpreted as a measure of the curva- 
ture of the graph, as shown in figure b. The graph of the function 
y = 2x is a line, with no curvature. Its first derivative is 2, and its
second derivative is zero. The function x^2 has a second derivative
of 2, and the more tightly curved function 7x^2 has a bigger second
derivative, 14.
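
The three second derivatives quoted above can be checked numerically. The following is only a rough sketch: the helper name second_derivative and the step size h are my own choices, not anything defined in the text, and the central-difference formula is just one convenient way of estimating a second derivative.

```python
def second_derivative(f, x, h=1e-3):
    """Central-difference estimate of f''(x); h is an arbitrary small step."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


# The three functions of figure b.
functions = {
    "2x": lambda x: 2 * x,
    "x^2": lambda x: x**2,
    "7x^2": lambda x: 7 * x**2,
}

# Estimate the second derivative of each one at x = 1 (any x would do,
# since these second derivatives are constants).
estimates = {name: second_derivative(f, 1.0) for name, f in functions.items()}

for name, value in estimates.items():
    print(f"{name}: y'' is approximately {value:.3f}")
```

The estimates come out close to 0, 2, and 14, in the same order as the increasing curvature of the three graphs.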


A positive second derivative tells us that the function is like a 
cup: it holds water. A negative second derivative says that the 
function spills water, like a cup that’s been turned upside-down. 
This distinction is referred to as the concavity of the function. In 
figure c, the function x^2 holds water. We say that it’s “concave
up,” and this corresponds to its positive second derivative. The
function 3 - x^2, with a second derivative less than zero, is concave
down. Another way of saying it is that if you’re driving along a
road shaped like x^2, going in the direction of increasing x, then
your steering wheel is turned to the left, whereas on a road shaped
like 3 - x^2 it’s turned to the right.


Figure d shows a third possibility. The function x^3 has a derivative
3x^2 and a second derivative 6x, which equals zero at x = 0.
This is called a point of inflection. The concavity of the graph is 
down on the left side, up on the right. The inflection point is where 
it switches from one concavity to the other. In the alternative de- 
scription in terms of the steering wheel, the inflection point is where 
your steering wheel is crossing from right to left. 


Definition 
A point of inflection is one at which the second derivative 
changes sign. 


A circle Example 1 
Consider the set of all points (x, y) at a fixed distance r from the
origin. This is a circle of radius r. Using the Pythagorean theorem,
we find that this set of points is defined by x^2 + y^2 = r^2. It
is not the graph of a function, since it fails the vertical line test. If
we solve for y, we get y = ±\sqrt{r^2 - x^2}, and since we have both
a positive and a negative square root, there are two possible values
of y. But if we arbitrarily choose the positive root, we have
the function

y = \sqrt{r^2 - x^2},

which is the equation of the semicircle lying above the x axis,
figure e.


To find the derivative y', we can rewrite y as (r^2 - x^2)^{1/2} and apply
the power rule and the chain rule. The result is

y' = -x (r^2 - x^2)^{-1/2}.

The second derivative is

y'' = -(r^2 - x^2)^{-1/2} - x^2 (r^2 - x^2)^{-3/2}.

Let’s evaluate the second derivative at x = 0. The result is y'' =
-1/r. The negative sign tells us that the graph is concave down.
The absolute value of the result is 1/r, which is a measure of
the curvature of the circle; a smaller radius indicates a stronger
curvature.
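
As a sanity check on the result y''(0) = -1/r, here is a small numerical sketch. The function names and the particular radii are assumptions I've made for illustration; the finite-difference estimate is an approximation, not the exact calculation done in the example.

```python
import math


def semicircle(x, r):
    """Upper semicircle y = sqrt(r^2 - x^2) of radius r."""
    return math.sqrt(r**2 - x**2)


def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x); h is an arbitrary small step."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


# Compare the numerical estimate at the top of the semicircle with -1/r
# for a few sample radii (the radii themselves are arbitrary).
estimates_by_radius = {}
for r in (1.0, 2.0, 5.0):
    estimates_by_radius[r] = second_derivative(lambda x: semicircle(x, r), 0.0)
    print(f"r = {r}: estimate {estimates_by_radius[r]:.4f}, -1/r = {-1 / r:.4f}")
```

The estimates are negative in every case and shrink in absolute value as r grows, which is the statement that a bigger circle is less tightly curved.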


When both f’ = 0 and f” = 0, the second derivative test is 
inconclusive. All three of the functions in figure f have f’(0) = 0 
and f”(0) =0, but we can’t tell purely from this information what 
is going on. In one case it’s a point of inflection, in one it’s a local 
minimum, and in one it’s a local maximum. 






















When the second derivative test is inconclusive, we need to find 
some other way to determine what’s going on. One option is graph- 
ing. Another possibility is to determine whether the derivative 
changes sign at the point in question. For example, the function 
x^4 has as its derivative 4x^3, and this changes sign from negative to
positive at x = 0, indicating a local minimum. 
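
The sign-change check described above can be sketched in a few lines. The function classify_critical_point and the step size are hypothetical names and choices of mine, and the three test functions x^3, x^4, and -x^4 are simply convenient examples that all have f'(0) = 0 and f''(0) = 0; only x^4 is taken from the text.

```python
def classify_critical_point(dfdx, x0, step=1e-3):
    """Classify a point where f'(x0) = 0 by the sign of f' just to either side.

    dfdx is the first derivative, supplied as a function. The step is an
    arbitrary small number; it must be small enough that no other zero of
    the derivative lies within it.
    """
    left = dfdx(x0 - step)
    right = dfdx(x0 + step)
    if left < 0 < right:
        return "local minimum"
    if left > 0 > right:
        return "local maximum"
    return "no extremum (derivative does not change sign)"


# First derivatives of three functions for which the second derivative
# test is inconclusive at x = 0.
derivatives = {
    "x^3": lambda x: 3 * x**2,
    "x^4": lambda x: 4 * x**3,
    "-x^4": lambda x: -4 * x**3,
}

results = {name: classify_critical_point(d, 0.0) for name, d in derivatives.items()}
for name, verdict in results.items():
    print(f"{name}: {verdict}")
```

This only looks at two nearby points, so it is a quick diagnostic rather than a proof; a graph is still a good cross-check.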
f / When both f' = 0 and f'' = 0,
the second derivative test is inconclusive.

g / A zero derivative often,
but not always, indicates a local
extremum. Sometimes we have
a zero derivative without a local
extremum, and sometimes a local
extremum with an undefined or
nonzero derivative.

h / Example 2.


3.3 Leibniz notation 


The Leibniz notation for y'' is

d^2y/dx^2.

The seemingly inconsistent placement of the exponents on the top
and bottom is actually exactly what we need if we want the units
to make sense. To see this in a concrete example, consider the
acceleration of an object expressed in terms of its position x:

a = d^2x/dt^2.

The units of x are meters, and the units of t are seconds. The
velocity dx/dt has units of meters per second, m/s. The rate at
which the velocity changes has units of meters per second per second,
m/s/s or m/s^2. This is exactly what is suggested by the Leibniz
notation.


3.4 Applications 
3.4.1 Extrema 


When a function goes up and then smoothly turns around and 
comes back down again, it has zero slope at the top. A place where 
y’ = 0, then, could represent a place where y was at a maximum. Or 
the function could be concave up, in which case we’d have a mini- 
mum. Figure g reprises some of the possible types of extrema alluded 
to briefly in section 1.5.3, p. 24. By testing the second derivative, 
we can distinguish among cases B, D, and H, which represent, re- 
spectively, a minimum, a maximum, and a point of inflection. The 
test will not distinguish between D, which is a global maximum, and 
G, which is only a local maximum. 


The second derivative test applied to order quantity Example 2 
In example 12 on p. 59 we analyzed a situation in which a retailer,
when it runs out of inventory, orders a quantity q of widgets from
the wholesale supplier. The result was that the retailer’s yearly
cost was given by a function of a certain form, of which an example is

C = 1 + 9/q + q.

By setting the first derivative

dC/dq = -9q^{-2} + 1

equal to zero and solving for q, we find q = 3. This could be a
minimum (good), a maximum (bad), or an inflection point. One
way to tell is by applying the second derivative test. The second
derivative is

d^2C/dq^2 = 18q^{-3}.

Plugging in q = 3, we find d^2C/dq^2 = 18/27, which is positive.
Therefore the function is concave up at q = 3, and this is indeed a
minimum. (In fact, this particular function happens to be concave
up everywhere. We only defined it for q > 0, because a negative
q doesn’t make sense in this context: the retailer doesn’t
produce widgets, and can’t sell them to the wholesaler. For any
positive value of q, the second derivative is positive.)
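
The arithmetic in this example is easy to reproduce. The sketch below just codes up the cost function and its two derivatives as written above; the function names are mine, and the comparison with the neighboring quantities 2.9 and 3.1 is an extra check I've added, not part of the original example.

```python
def cost(q):
    """Yearly cost C = 1 + 9/q + q from the example (defined for q > 0)."""
    return 1 + 9 / q + q


def d_cost(q):
    """First derivative dC/dq = -9 q^-2 + 1."""
    return -9 / q**2 + 1


def d2_cost(q):
    """Second derivative d^2C/dq^2 = 18 q^-3."""
    return 18 / q**3


q_star = 3.0  # the critical point found by setting dC/dq = 0

print("dC/dq at q = 3:", d_cost(q_star))
print("d^2C/dq^2 at q = 3:", d2_cost(q_star), "vs 18/27 =", 18 / 27)
print("C(2.9), C(3), C(3.1):", cost(2.9), cost(q_star), cost(3.1))
```

The first derivative vanishes at q = 3 to within rounding, the second derivative is positive there, and the cost at q = 3 is lower than at either neighboring quantity, all consistent with a minimum.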


One minimum and one maximum Example 3 
> Locate all extrema of the function 


y = x^{-1} + x.


Use the second derivative test to determine which are maxima
and which are minima, and check your result by graphing. Are
these global extrema, or only local ones?


> This function is undefined at x = 0 because x^{-1} blows up as
x approaches zero. However, if there are extrema that occur at
x ≠ 0, where the function is smooth, we should be able to find
them by looking for places where y' = 0. We have

y' = -x^{-2} + 1,

which equals zero at x = ±1. These points could be maxima,
minima, or points of inflection. The second derivative is

y'' = 2x^{-3}.

Plugging in x = +1 gives a positive result, so this is a minimum.
Plugging in x = -1 gives a negative result, which means that it’s
a maximum.


The graph, figure i, verifies the results of the second derivative 
test. The function is odd, so it makes sense that we get a maxi- 
mum and a minimum that are symmetrically disposed. The graph 
also reveals that the extrema we've found are only local ones. 
The function has no global extrema. 
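
Here is a short sketch of the same analysis in code. The helper second_derivative_test is a hypothetical name of my own, and the sample points x = 0.01 and x = -0.01 are arbitrary choices used only to show that the extrema are local rather than global.

```python
def y(x):
    """The function y = 1/x + x (undefined at x = 0)."""
    return 1 / x + x


def y_double_prime(x):
    """Second derivative y'' = 2 x^-3."""
    return 2 / x**3


def second_derivative_test(d2, x0):
    """Apply the second derivative test at a point where y' = 0."""
    curvature = d2(x0)
    if curvature > 0:
        return "local minimum"
    if curvature < 0:
        return "local maximum"
    return "inconclusive"


# The two critical points found by setting y' = -x^-2 + 1 equal to zero.
labels = {x0: second_derivative_test(y_double_prime, x0) for x0 in (1.0, -1.0)}
print(labels)

# The "minimum" value y(1) = 2 is undercut near x = 0 on the negative side,
# and the "maximum" value y(-1) = -2 is exceeded near x = 0 on the positive
# side, so neither extremum is global.
print(y(-0.01) < y(1.0), y(0.01) > y(-1.0))
```

Notice that the local maximum, y = -2, is actually lower than the local minimum, y = 2. That can only happen because the function is discontinuous at x = 0.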
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j/ Example 4. 





k / Isaac Newton (1642-1727). 








A fruitless search Example 4
> Locate all local extrema of the function

y = x^3 - 6x^2 + 12x.

Use the second derivative test to determine which are maxima
and which are minima.


> The function is smooth everywhere, so any extrema must be at
points where the derivative

y' = 3x^2 - 12x + 12

vanishes. The quadratic formula tells us that there is only one
such point, x = 2. The second derivative

y'' = 6x - 12

is zero at this point and changes sign there, so it’s a point of
inflection, not a maximum or minimum. This function has no local
extrema. (The original function can in fact be rewritten as
y = (x - 2)^3 + 8, which gives more insight. It’s simply the function
y = x^3, shifted 2 units to the right and 8 units up.)
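
The claims in this example can be spot-checked with integer arithmetic, which avoids any rounding. This is only a sketch: the range of sample points is an arbitrary choice of mine, and checking a finite set of points illustrates the algebra rather than proving it.

```python
def y(x):
    """The original form y = x^3 - 6x^2 + 12x."""
    return x**3 - 6 * x**2 + 12 * x


def y_shifted(x):
    """The rewritten form y = (x - 2)^3 + 8."""
    return (x - 2)**3 + 8


def y_prime(x):
    """First derivative y' = 3x^2 - 12x + 12, which equals 3(x - 2)^2."""
    return 3 * x**2 - 12 * x + 12


def y_double_prime(x):
    """Second derivative y'' = 6x - 12."""
    return 6 * x - 12


samples = range(-5, 10)  # arbitrary integer sample points

# The two forms of the function agree at every sample point.
identity_holds = all(y(x) == y_shifted(x) for x in samples)

# The derivative is never negative, so the function never turns around.
derivative_nonnegative = all(y_prime(x) >= 0 for x in samples)

# The second derivative changes sign across x = 2.
sign_change = y_double_prime(1) < 0 < y_double_prime(3)

print(identity_holds, derivative_nonnegative, sign_change)
```

Since y' = 3(x - 2)^2 is zero only at x = 2 and positive everywhere else, the function is always increasing, which is another way of seeing that the search for extrema had to come up empty.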


3.4.2 Newton’s second law 


The ancient Greek philosopher Aristotle claimed that force was 
required in order to create motion, and this seemed reasonable to 
Europeans for a thousand years afterward, since it was in accord 
with everyday experience. Although Aristotle didn’t use equations, 
we can imagine putting his theory into mathematical form like this: 
F = m dx/dt    [“Aristotle’s law of motion”]

Here F is the force exerted on an object, x is the object’s position,
and m is a constant of proportionality, which would presumably be
a measure of the object’s size, mass, or inertia.


Aristotle was wrong. What he didn’t understand was that fric- 
tion is a force as well. When objects “naturally” slow down, it’s not 
because that’s their automatic tendency but rather because friction 
is acting. The moon doesn’t experience any friction as it orbits the 
earth, so it doesn’t slow down at all. 


Isaac Newton, who was also one of the inventors of the calculus, 
gave a correct account in the form of an equation now known as 
Newton’s second law: 
F = m d^2x/dt^2    [Newton’s second law]

A force causes an acceleration, not a velocity. In Newton’s second
law, F represents the sum of all the forces acting on the object of
interest. For example, when you drive on the freeway at constant
speed, your acceleration is zero. This is because the total force
acting on your car is zero. The forward force generated by the tires’
traction on the road is canceled out by backward forces such as air
resistance.
3.4.3 Indifference curves


The concept of an indifference curve was introduced in example
2, p. 18. To recapitulate briefly, the person whose indifference curve
is drawn in figure l is equally happy having the combination of beer
and sushi represented by any point on the curve. A very common
assumption in economics is that indifference curves always have y'' >
0. This means that once you have a lot of something, you value it
less. The large, negative slope at point P in figure l means that this
person already has plenty of beer, and would trade a lot of beer for
a small amount of sushi. The small negative slope at Q indicates
the opposite.


When an indifference curve has y'' = 0, it’s a line. This indicates
that each of the two commodities is a perfect substitute for the 
other. For example, most people don’t care whether they buy an 
airline ticket from one airline or another. 


Discussion question 


A __ Figure m shows a person throwing a ball straight up in the air, with 
the corresponding graphs drawn below for the height x and velocity v as 
functions of time. True or false: at the top of the motion, the ball is at rest, 
so it has no motion; you can’t have acceleration without motion, so the 
ball’s acceleration equals zero at the top. 





Higher derivatives 


When we take the derivative of a function f, the derivative f’ is 
itself a function, so it made sense to apply the same operation again 
and find the second derivative f”. We can continue in this way. 
The derivative of the second derivative is called the third derivative, 
written f’”, and so on. 


The nth derivative of f is denoted f^{(n)}. Thus

f^{(0)} = f,   f^{(1)} = f',   f^{(2)} = f'',   f^{(3)} = f''',   ...


Leibniz’ notation for the nth derivative of y = f(x) is

d^n y/dx^n = f^{(n)}(x).


Jerk and damage Example 5 
Higher derivatives are often useful; for example, you will need
them in your second-semester calculus course in order to compute
Taylor series, which are often used in approximating functions.
There are not many examples, however, in which f^{(n)} has a
direct, intuitive interpretation for n > 2. The best example I know
of is the following for n = 3.


It’s very common for a mechanical system to be damaged by vibration.
For example, when a human runs, the impact of the foot
on the ground causes a shock wave to travel up the leg, and runners
frequently suffer from injuries as a result. When a machine
shop cuts metal, it’s possible for the whole setup to start vibrating
violently, and if the lathe or mill isn’t shut down promptly, the result
can be serious damage to the work or the machine.


l / Indifference curves are concave up.


Mathematically, what is the variable that measures how likely dam- 
age is to occur in these examples? The motion of an object is 
described using its position as a function of time, x(t). If x is 
a constant, then the object is sitting still and clearly no damage 
can result, so this suggests taking a derivative. But if x’ is constant,
we also expect no damage. This derivative measures the velocity,
and velocity doesn’t relate to force; acceleration x” does
(Newton’s second law, section 3.4.2, p. 88). Even an acceleration,
however, does not necessarily lead to damage. When your
body is subject to a steady acceleration, it just feels like a steady 
pressure, or perhaps, depending on the direction of the accel- 
eration, an increase in your weight. A steady acceleration will 
never cause an object to shake or vibrate. Such an effect can 
only happen if the third derivative x’” is nonzero. This quantity is 
sometimes called the “jerk.” Cf. example 3, p. 159. 














Two examples Example 6 
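If \(f(x) = x^2 - 2x + 3\) and \(g(x) = x/(1 - x)\) then
\[
\begin{aligned}
f(x) &= x^2 - 2x + 3 & g(x) &= \frac{x}{1 - x} \\
f'(x) &= 2x - 2 & g'(x) &= \frac{1}{(1 - x)^2} \\
f''(x) &= 2 & g''(x) &= \frac{2}{(1 - x)^3} \\
f^{(3)}(x) &= 0 & g^{(3)}(x) &= \frac{2 \cdot 3}{(1 - x)^4} \\
f^{(4)}(x) &= 0 & g^{(4)}(x) &= \frac{2 \cdot 3 \cdot 4}{(1 - x)^5}
\end{aligned}
\]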


All further derivatives of f are zero, but no matter how often we
differentiate g(x) we will never get zero. Instead of multiplying the
numbers in the numerator of the derivatives of g we left them as
“2 · 3 · 4.” A good reason for doing this is that we can see a pattern
in the derivatives, which would allow us to guess what (say) the
10th derivative is, without actually computing ten derivatives:
\[ g^{(10)}(x) = \frac{2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 \cdot 8 \cdot 9 \cdot 10}{(1 - x)^{11}}. \]
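The pattern in this example can be checked mechanically. The following is only a minimal sketch: the helper names `poly_derivative` and `inverse_power_derivative` are made up for illustration, and the second one relies on rewriting g(x) = x/(1 − x) as −1 + 1/(1 − x), so that every derivative of g has the form c/(1 − x)^k.

```python
from math import factorial

def poly_derivative(coeffs):
    """Differentiate a polynomial stored as [a0, a1, a2, ...] (a_k multiplies x**k)."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def inverse_power_derivative(term):
    """Differentiate c/(1 - x)**k, stored as (c, k), giving c*k/(1 - x)**(k + 1)."""
    c, k = term
    return (c * k, k + 1)

# f(x) = x^2 - 2x + 3, stored lowest power first.
f_derivs = [[3, -2, 1]]
for _ in range(4):
    f_derivs.append(poly_derivative(f_derivs[-1]))

# g(x) = -1 + 1/(1 - x), so g'(x) = 1/(1 - x)^2, stored as (1, 2).
g_n = (1, 2)
for _ in range(9):  # nine more derivatives take g' to the 10th derivative
    g_n = inverse_power_derivative(g_n)

print(f_derivs)  # [[3, -2, 1], [-2, 2], [2], [], []]
print(g_n)       # (3628800, 11)
print(g_n[0] == factorial(10))  # True
```

The empty lists stand for the zero polynomial, which matches the statement that all further derivatives of f vanish, while the numerator for g keeps growing as 2 · 3 · … · 10 = 10!.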





In section 1.7 we introduced a variation on the Leibniz notation 
called the operator notation, as in 
\[ \frac{d(x^3 - x)}{dx} = \frac{d}{dx}(x^3 - x) = 3x^2 - 1. \]







For higher derivatives one can write
\[ \frac{d^2 y}{dx^2} = \frac{d}{dx}\,\frac{d}{dx}\, y = \left(\frac{d}{dx}\right)^2 y. \]





Be careful to distinguish the second derivative from the square of
the first derivative. Usually
\[ \frac{d^2 y}{dx^2} \neq \left(\frac{dy}{dx}\right)^2. \]


Problem b1.


Problem c4.


Problems 


a1 Find the second derivative of \(3z^4 - 4z^2 + 6\)
with respect to z. > Solution, p. 231


a2 Find the second derivative of 4q? + 3q? + 4q—1 
with respect to q. v 


a3 Find the second derivative of —1lw?® + 5w? +6 
with respect to w. v 


a4 Find the second derivative of c®” — 18c? + 987 
with respect to c. v 


a5 Find the second derivative of 10r!9 — 6r® + 7
with respect to r. v 


bl (a) Use the graph to visually estimate the location of the 
inflection point of the function 


peopel 
x 


(b) Use calculus to find the point exactly. v 


c1 Locate any points of inflection of the function \(x(t) = t^3 + t^2\).
Verify by graphing that the concavity of the function reverses itself 
at this point. > Solution, p. 231 


c2 Functions f and g are defined on the whole real line, and 
are differentiable everywhere. Let s = f + g be their sum. In what 
ways, if any, are the extrema of f, g, and s related? 

> Solution, p. 231 


c3 (a) Consider a function of the form \(f(x) = x^p\), where p could
be any real number. For what values of p is f”(0) well defined?
Note that there are some special cases where the whole function f”
vanishes identically.

(b) Repeat part a for the following function.
\[ f(x) = \begin{cases} 0 & \text{for } x < 0 \\ x^p & \text{for } x > 0 \end{cases} \]
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c4 A blimp of mass m is initially at rest, and then the pilot 
turns on the propellers. The propellers gradually speed up, and
while they’re speeding up, the force accelerating the blimp is given
by F = kt, where k is a constant.

(a) If time is measured in units of seconds (s), mass in kilograms
(kg), and force in kilogram-meters/second² (kg·m/s²), infer the units
of k (section 1.9, p. 34).

(b) Show that there is a function of the form \(x = ct^p\) that satisfies
Newton’s second law, determine the constants c and p, and substitute
these to find x(t).

(c) Check that the units of your answer to part b make sense. √


c5 Suppose that f is an even function, and g is odd. What can 
you say about f” and g”? (Cf. problem m4, p. 43.) 


c6 Suppose we have a list of numbers \(x_1, \ldots, x_n\), and we wish to
find some number q that is as close as possible to as many of the
\(x_i\) as possible. To make this a mathematically precise goal, we need
to define some numerical measure of this closeness. Suppose we let
\(h = (x_1 - q)^2 + \ldots + (x_n - q)^2\), which can also be notated using
Σ, uppercase Greek sigma, as \(h = \sum_{i=1}^{n} (x_i - q)^2\). Then minimizing
h can be used as a definition of optimal closeness. (Why would we
not want to use \(h = \sum_{i=1}^{n} (x_i - q)\)?) Prove that the value of q that
extremizes h is the average of the \(x_i\), and use the second derivative
test to prove that the extremum is a minimum.
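A numerical spot-check can make the claim in this problem concrete before attempting the proof. The sketch below is illustrative only: the data list is hypothetical, and the function names `h`, `dh_dq`, and `d2h_dq2` are chosen here to mirror the notation of the problem. It checks one data set and proves nothing in general.

```python
from statistics import mean

def h(xs, q):
    """Sum of squared deviations of the data xs from the candidate value q."""
    return sum((x - q) ** 2 for x in xs)

def dh_dq(xs, q):
    """First derivative of h with respect to q, i.e. 2 * sum(q - x_i)."""
    return 2 * sum(q - x for x in xs)

def d2h_dq2(xs):
    """Second derivative of h with respect to q: 2n, positive for any data."""
    return 2 * len(xs)

xs = [2.0, 3.0, 7.0, 8.0]   # hypothetical sample data
q_star = mean(xs)           # the average, 5.0

print(dh_dq(xs, q_star))    # 0.0, so the average is a critical point
print(d2h_dq2(xs))          # 8, positive, so the critical point is a minimum
print(h(xs, q_star), h(xs, q_star + 0.5), h(xs, q_star - 0.5))  # 26.0 27.0 27.0
```

Moving q away from the average in either direction raises h, which is what the second derivative test predicts.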


c7 In problem p1 on p. 74, I presented a bell-shaped graph with
a minimum at f = 0 and a maximum at a nonzero f. Actually, for
large enough values of b, the global maximum is at f = 0. Find the
smallest value of b for which this happens. √


c8 The equation
\[ \frac{2x}{x^2 - 1} = \frac{1}{x + 1} + \frac{1}{x - 1} \]
holds for any value of x for which both sides are defined. (There is a
general method, called the method of partial fractions, for rewriting
a rational function such as the left-hand side in terms of a sum of
simpler functions as in our right-hand side.) Compute the third
derivative of \(f(x) = 2x/(x^2 - 1)\) by using either the left or right
hand side (your choice) of the equation. √





In problems e1-e3, compute the first, second, and third derivatives 
of the given functions. 


e1 \(f(x) = (x + 1)^4\) √
e2 \(g(x) = (x^2 + 1)^4\) √
e3 \(h(x) = \sqrt{x - 2}\) √


In problems g1-g6, find the derivatives of 10th order of the given
function. (The problems have been chosen so that after doing the
first few derivatives in each case, you should start seeing a pattern
that will let you guess the 10th derivative without actually computing
10 derivatives.) You will find it convenient in most of these problems
to express your results in terms of the notation \(n! = 1 \cdot 2 \cdots n\)
introduced in sec. 2.10, p. 66. The problems are in increasing order
of difficulty.





g1 \(f(x) = x^{12} + x^8\) √
g2 \(g(x) = 1/x\) √
g3 \(h(x) = 12/(1 - x)\) √
g4 \(k(x) = 1/(1 - 2x)\) √
g5 \(l(x) = x/(1 + x)\) √
g6 \(m(x) = x^2/(1 - x)\) √
g7 Find f’(x), f”(x) and \(f^{(3)}(x)\) if
a ee a 
PEN Ee te 54 10. 0 
Vv 


g8 Proof by induction was introduced in section 2.6.1, p. 58. 
Use induction to prove that 
qrti 
dantiy 
if n > 0 is an integer. 


Suggestion: To get an idea of what’s going on, calculate the deriva- 
tive for the first few values of n. Then formulate a convincing ex- 
planation of what’s going on. Then find a way to reduce case n to 
case n — 1, and formulate a proof by induction. 


g9 Consider the function
\[ f(x) = \frac{1}{1 - x}. \]
If we calculate \(f^{(n)}(0)\), we seem to get n! (see sec. 2.10, p. 66 for the
notation and the special case 0! = 1).


Proof by induction was introduced in section 2.6.1, p. 58. Use induction
to prove that \(f^{(n)}(0) = n!\).
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Chapter 4 


More about limits; curve 
sketching 


4.1 Properties of the limit 


In ch. 2 we did very few direct computations of limits using the 
epsilon-delta definition. Epsilon-delta proofs are hard work, and by 
building up a more sophisticated set of tools we can usually avoid 
having to apply the epsilon-delta definition directly. 


4.1.1 Limits of constants and of x 


If a and c are constants, then
\[ \lim_{x \to a} c = c \qquad (P_1) \]
and
\[ \lim_{x \to a} x = a. \qquad (P_2) \]


4.1.2 Limits of sums, products and quotients 


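Let \(F_1\) and \(F_2\) be two given functions whose limits for \(x \to a\) we
know,
\[ \lim_{x \to a} F_1(x) = L_1, \qquad \lim_{x \to a} F_2(x) = L_2. \]
Then
\[ \lim_{x \to a} \bigl(F_1(x) + F_2(x)\bigr) = L_1 + L_2, \qquad (P_3) \]
\[ \lim_{x \to a} \bigl(F_1(x) - F_2(x)\bigr) = L_1 - L_2, \qquad (P_4) \]
\[ \lim_{x \to a} \bigl(F_1(x) \cdot F_2(x)\bigr) = L_1 \cdot L_2. \qquad (P_5) \]
Finally, if \(\lim_{x \to a} F_2(x) \neq 0\),
\[ \lim_{x \to a} \frac{F_1(x)}{F_2(x)} = \frac{L_1}{L_2}. \qquad (P_6) \]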


In other words the limit of the sum is the sum of the limits, etc. 
One can prove these laws using the definition of the limit, but we 
will not do this here. However, I hope these laws seem like common 
sense: if, for x close to a, the quantity \(F_1(x)\) is close to \(L_1\) and
\(F_2(x)\) is close to \(L_2\), then certainly \(F_1(x) + F_2(x)\) should be close to
\(L_1 + L_2\).
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Example 1 
In this example we compute several limits, building up from simple 
examples to more complicated ones. 


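First let’s evaluate \(\lim_{x \to 2} x^2\). We have
\[
\begin{aligned}
\lim_{x \to 2} x^2 &= \lim_{x \to 2} x \cdot x \\
&= \Bigl(\lim_{x \to 2} x\Bigr) \cdot \Bigl(\lim_{x \to 2} x\Bigr) && \text{by } (P_5) \\
&= 2 \cdot 2 = 4.
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
\lim_{x \to 2} x^3 &= \lim_{x \to 2} x \cdot x^2 \\
&= \Bigl(\lim_{x \to 2} x\Bigr) \cdot \Bigl(\lim_{x \to 2} x^2\Bigr) && (P_5) \text{ again} \\
&= 2 \cdot 4 = 8,
\end{aligned}
\]
and, by \((P_4)\),
\[ \lim_{x \to 2} (x^2 - 1) = \lim_{x \to 2} x^2 - \lim_{x \to 2} 1 = 4 - 1 = 3, \]
and, by \((P_4)\) again,
\[ \lim_{x \to 2} (x^3 - 1) = \lim_{x \to 2} x^3 - \lim_{x \to 2} 1 = 8 - 1 = 7. \]
Putting all this together, we get
\[ \lim_{x \to 2} \frac{x^3 - 1}{x^2 - 1} = \frac{2^3 - 1}{2^2 - 1} = \frac{8 - 1}{4 - 1} = \frac{7}{3} \]
because of \((P_6)\). To apply \((P_6)\) we must check that the denominator
(“\(L_2\)”) is not zero. Since the denominator is 3, this was all
right.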
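As an informal cross-check of this example, one can evaluate the quotient at points approaching x = 2 from both sides and watch the values settle near 7/3. This is a sketch only: the function name `quotient` and the chosen offsets are illustrative, and a table of values suggests a limit without proving it.

```python
def quotient(x):
    """The function from the example, (x^3 - 1)/(x^2 - 1), defined for x != 1, -1."""
    return (x**3 - 1) / (x**2 - 1)

target = 7 / 3  # the value obtained from the limit properties

# Approach x = 2 from the left and from the right with shrinking offsets.
for offset in (0.1, 0.01, 0.001, 0.0001):
    left, right = quotient(2 - offset), quotient(2 + offset)
    print(f"offset={offset:<7} left={left:.6f} right={right:.6f} target={target:.6f}")
```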


The limit of a square root Example 2
> Find \(\lim_{x \to 2} \sqrt{x}\).


> Of course, you would think that \(\lim_{x \to 2} \sqrt{x} = \sqrt{2}\) and you can
indeed prove this using δ and ε. But is there an easier way? There
is nothing in the limit properties which tells us how to deal with
a square root, and using them we can’t even prove that there is
a limit. However, if you assume that the limit exists then the limit
properties allow us to find this limit.


The argument goes like this: suppose that there is a number L
with
\[ \lim_{x \to 2} \sqrt{x} = L. \]
Then property \((P_5)\) implies that
\[ L^2 = \Bigl(\lim_{x \to 2} \sqrt{x}\Bigr) \cdot \Bigl(\lim_{x \to 2} \sqrt{x}\Bigr) = \lim_{x \to 2} \sqrt{x} \cdot \sqrt{x} = \lim_{x \to 2} x = 2. \]
In other words, \(L^2 = 2\), and hence L must be either \(\sqrt{2}\) or \(-\sqrt{2}\).
We can reject the latter because whatever x does, its square root
is always a positive number, and hence it can never “get close to”
a negative number like \(-\sqrt{2}\).


Our conclusion: if the limit exists, then
\[ \lim_{x \to 2} \sqrt{x} = \sqrt{2}. \]
The result is not surprising: if x gets close to 2 then \(\sqrt{x}\) gets close
to \(\sqrt{2}\).


4.2 When limits fail to exist 


In example 2 we worried about the possibility that a limit \(\lim_{x \to a} g(x)\)
might not actually exist. This can happen, and in this section
we’ll see a few examples of what failed limits look like. First
let’s agree on what we will call a “failed limit.”


If there is no number L such that \(\lim_{x \to a} f(x) = L\), then we
say that the limit \(\lim_{x \to a} f(x)\) does not exist.


The sign function near x = 0 Example 3 
The “sign function” is defined by
\[ \operatorname{sign}(x) = \begin{cases} -1 & \text{for } x < 0 \\ 0 & \text{for } x = 0 \\ 1 & \text{for } x > 0 \end{cases} \]
Note that “the sign of zero” is defined to be zero. But does the
sign function have a limit at x = 0, i.e. does \(\lim_{x \to 0} \operatorname{sign}(x)\) exist?
And is it also zero? The answers are no and no, and here is why:
suppose that for some number L one had
\[ \lim_{x \to 0} \operatorname{sign}(x) = L, \]
then since for arbitrarily small positive values of x one has sign(x) =
+1 one would think that L = +1. But for arbitrarily small negative
values of x one has sign(x) = −1, so one would conclude that
L = −1. But one number L can’t be both +1 and −1 at the same
time, so there is no such L, i.e. there is no limit.
\[ \lim_{x \to 0} \operatorname{sign}(x) \text{ does not exist.} \]


In examples like this one, it is possible to define a one-sided limit; 
see section 4.3.1. 




a/ The sign function. 
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b / Example 4. 


The “backward sine” Example 4 
Figure b shows the “backward sine” function \(f(x) = \sin(\pi/x)\). Contemplate
its limit as \(x \to 0\):
\[ \lim_{x \to 0} \sin\left(\frac{\pi}{x}\right). \]
When x = 0 the function f(x) is not defined, because its definition
involves division by x. What happens to f(x) as \(x \to 0\)? First, \(\pi/x\)
becomes larger and larger (“goes to infinity”) as \(x \to 0\). Then,
taking the sine, we see that \(\sin(\pi/x)\) oscillates between +1 and
−1 infinitely often as \(x \to 0\). This means that f(x) gets close to
any number between −1 and +1 as \(x \to 0\), but that the function
f(x) never stays close to any particular value because it keeps
oscillating up and down. The limit fails to exist, but for a different
reason than in example 3.
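The oscillation described in this example can be seen numerically by picking two sequences of x values that both shrink toward zero, one landing on the peaks of the sine and one on the troughs. This is a small sketch with illustrative names (`backward_sine`, `x_peak`, `x_trough`); the particular sequences are one convenient choice, not the only one.

```python
import math

def backward_sine(x):
    """f(x) = sin(pi/x), undefined at x = 0."""
    return math.sin(math.pi / x)

for n in (1, 10, 100, 1000):
    x_peak = 2 / (4 * n + 1)     # pi/x = 2*pi*n + pi/2, where the sine is +1
    x_trough = 2 / (4 * n + 3)   # pi/x = 2*pi*n + 3*pi/2, where the sine is -1
    print(f"x={x_peak:.6f} f={backward_sine(x_peak):+.6f}   "
          f"x={x_trough:.6f} f={backward_sine(x_trough):+.6f}")
```

Both sequences of x values approach 0, yet the function values stay near +1 along one and near −1 along the other, so no single number L can serve as the limit.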


Trying to divide by zero using a limit Example 5 
The expression 1/0 is not defined, but what about
\[ \lim_{x \to 0} \frac{1}{x}? \]
This limit also does not exist. Here are two reasons:


It is common wisdom that if you divide by a small number you get
a large number, so as \(x \searrow 0\) the quotient 1/x will not be able to
stay close to any particular finite number, and the limit can’t exist.


“Common wisdom” is not always a reliable tool in mathematical
proofs, so here is a better argument. The limit can’t exist,
because that would contradict the limit properties \((P_1) \cdots (P_6)\).
Namely, suppose that there were a number L such that
\[ \lim_{x \to 0} \frac{1}{x} = L. \]
Then the limit property \((P_5)\) would imply that
\[ \lim_{x \to 0} \left(\frac{1}{x} \cdot x\right) = \left(\lim_{x \to 0} \frac{1}{x}\right) \cdot \left(\lim_{x \to 0} x\right) = L \cdot 0 = 0. \]
On the other hand \(\frac{1}{x} \cdot x = 1\), so the above limit should be 1! A
number can’t be both 0 and 1 at the same time, so we have a
contradiction. The assumption that \(\lim_{x \to 0} 1/x\) exists is to blame,
so it must go.
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4.2.1 Using limit properties to show a limit does not exist 


The limit properties tell us how to prove that certain limits exist 
(and how to compute them). Although it is perhaps not so obvious 
at first sight, they also allow you to prove that certain limits do not 
exist. Example 5 shows one instance of such use. Here is another. 


Property \((P_3)\) says that if both \(\lim_{x \to a} g(x)\) and \(\lim_{x \to a} h(x)\)
exist then \(\lim_{x \to a} \bigl(g(x) + h(x)\bigr)\) also must exist. You can turn this
around and say that if \(\lim_{x \to a} \bigl(g(x) + h(x)\bigr)\) does not exist then either
\(\lim_{x \to a} g(x)\) or \(\lim_{x \to a} h(x)\) does not exist (or both limits fail to
exist).


For instance, the limit
\[ \lim_{x \to 0} \left(\frac{1}{x} - x\right) \]
can’t exist, for if it did, then the limit
\[ \lim_{x \to 0} \frac{1}{x} = \lim_{x \to 0} \left(\frac{1}{x} - x + x\right) = \lim_{x \to 0} \left(\frac{1}{x} - x\right) + \lim_{x \to 0} x \]
would also have to exist, and we know \(\lim_{x \to 0} \frac{1}{x}\) doesn’t exist.


4.3 Variations on the theme of the limit 


Not all limits are “for x — a”. Here we describe some variations on 
the concept of limit. 


4.3.1 Left and right limits 


When we let “x approach a” we allow x to be larger or smaller 
than a, as long as x “gets close to a”. If we explicitly want to study 
the behavior of f(x) as x approaches a through values larger than 
a, then we write 


\[ \lim_{x \searrow a} f(x) \quad\text{or}\quad \lim_{x \to a+} f(x) \quad\text{or}\quad \lim_{x \to a+0} f(x) \quad\text{or}\quad \lim_{x \to a,\ x > a} f(x). \]


All four notations are commonly used. Similarly, to designate the
value which f(x) approaches as x approaches a through values below
a one writes
\[ \lim_{x \nearrow a} f(x) \quad\text{or}\quad \lim_{x \to a-} f(x) \quad\text{or}\quad \lim_{x \to a-0} f(x) \quad\text{or}\quad \lim_{x \to a,\ x < a} f(x). \]
The precise definition of these “one-sided” limits goes like this:


Definition of right- and left-limits
Let f be a function. Then the right-limit notation
\[ \lim_{x \searrow a} f(x) = L \qquad (1) \]
means that for every ε > 0 one can find a δ > 0 such that
\[ a < x < a + \delta \implies |f(x) - L| < \varepsilon \]
holds for all x in the domain of f.
The definition of a left-limit is exactly analogous. When we say
\[ \lim_{x \nearrow a} f(x) = L \qquad (2) \]
we mean that for every ε > 0 one can find a δ > 0 such that
\[ a - \delta < x < a \implies |f(x) - L| < \varepsilon \]
holds for all x in the domain of f.


The following theorem tells you how to use one-sided limits to 
decide if a function f(x) has a limit at x =a. 


Theorem 
The two-sided limit \(\lim_{x \to a} f(x)\) exists if and only if the two
one-sided limits
\[ \lim_{x \searrow a} f(x) \quad\text{and}\quad \lim_{x \nearrow a} f(x) \]
exist and have the same value.
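The theorem can be illustrated with a rough numerical experiment: sample a function at points approaching a from the right only, then from the left only, and compare. The helper `one_sided_samples` below is a hypothetical name introduced for this sketch, and sampling a handful of points is a heuristic, not a substitute for the ε-δ definition.

```python
def sign(x):
    """The sign function from example 3."""
    return -1 if x < 0 else (1 if x > 0 else 0)

def one_sided_samples(f, a, side, steps=6):
    """Evaluate f at a + side*10**-k for k = 1..steps (side = +1 right, -1 left)."""
    return [f(a + side * 10.0 ** -k) for k in range(1, steps + 1)]

right = one_sided_samples(sign, 0.0, +1)
left = one_sided_samples(sign, 0.0, -1)
print(right)  # [1, 1, 1, 1, 1, 1]
print(left)   # [-1, -1, -1, -1, -1, -1]

# Same experiment for x**2 at a = 2, where both sides approach 4.
print(one_sided_samples(lambda x: x * x, 2.0, +1)[-1])
print(one_sided_samples(lambda x: x * x, 2.0, -1)[-1])
```

For the sign function the two one-sided values disagree (+1 versus −1), consistent with the two-sided limit failing to exist; for x² they agree, consistent with the limit 4.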


4.3.2 Limits at infinity 


So far we have defined the limit of a function f(x) as x gets 
closer and closer to some finite value. It can also be of interest to 
let x become “larger and larger” and ask what happens to f(x). If 
there is a number L such that f(x) gets arbitrarily close to L if one 
chooses x sufficiently large, then we write 


\[ \lim_{x \to \infty} f(x) = L \]
(“The limit for x going to infinity is L.”) We have an analogous
definition for what happens to f(x) as x becomes very large and
negative: we write
\[ \lim_{x \to -\infty} f(x) = L \]
(“The limit for x going to negative infinity is L.”)


Here are the precise definitions: 


Definitions of limits at infinity 
Let f(x) be a function which is defined on an interval \(x_0 < x < \infty\).
If there is a number L such that for every ε > 0 we can find an A
such that
\[ x > A \implies |f(x) - L| < \varepsilon \]
for all x, then we say that the limit of f(x) for \(x \to \infty\) is L.


Similarly, let f(x) be a function which is defined on an interval
\(-\infty < x < x_0\). If there is a number L such that for every ε > 0 we
can find an A such that
\[ x < -A \implies |f(x) - L| < \varepsilon \]
for all x, then we say that the limit of f(x) for \(x \to -\infty\) is L.


These definitions are very similar to the original definition of the
limit in section 2.1 on p. 47. Instead of δ which specifies how close
x should be to a, we now have a number A that says how large
x should be, which is a way of saying “how close x should be to
infinity” (or to negative infinity).


But although these definitions are similar to the original one,
they are not quite the same. Note that there is no real number
called ∞, and therefore we can’t just take the definition of \(\lim_{x \to a}\)
and substitute ∞ for a. (Cf. rule 2 on p. 65.)





The limit of 1/x Example 6 
The larger x is, the smaller its reciprocal, so it seems natural that
\(1/x \to 0\) as \(x \to \infty\). To prove that \(\lim_{x \to \infty} 1/x = 0\), we apply the
definition to f(x) = 1/x, L = 0.


For a given ε > 0, we need to show that
\[ \left|\frac{1}{x} - 0\right| < \varepsilon \text{ for all } x > A \qquad (3) \]
provided we choose the right A.


c/ The value of A is large enough
for the given ε. The graph could
represent the dying vibration of a
gong as a function of time. Because
we can find such an A for
every ε, the vibration dies out to
zero as time approaches infinity.
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How do we choose A? A is not allowed to depend on x, but it may 
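depend on ε.


Let’s decide that we will always take A > 0, so that we only need
consider positive values of x. Then (3) simplifies to
\[ \frac{1}{x} < \varepsilon, \]
which is equivalent to
\[ x > \frac{1}{\varepsilon}. \]
This tells us how to choose A. Given any positive ε, we will simply
choose
\[ A = \text{the larger of } 0 \text{ and } \frac{1}{\varepsilon}. \]
Then we have \(\left|\frac{1}{x} - 0\right| = \frac{1}{x} < \varepsilon\) for all x > A, so we have proved
that \(\lim_{x \to \infty} 1/x = 0\).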
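The choice of A in this example can be spot-checked numerically. The sketch below uses hypothetical helper names (`choose_A`, `satisfies_definition`) and only tests finitely many sample points beyond A, so it illustrates the definition rather than proving anything.

```python
def choose_A(eps):
    """Threshold from the example: for f(x) = 1/x and L = 0, take A = 1/eps."""
    return max(0.0, 1.0 / eps)

def satisfies_definition(eps, samples=1000):
    """Spot-check |1/x - 0| < eps at sample points x > A (a check, not a proof)."""
    A = choose_A(eps)
    xs = [A * (1 + k / 10) for k in range(1, samples + 1)]  # all strictly greater than A
    return all(abs(1 / x - 0) < eps for x in xs)

for eps in (0.5, 0.01, 1e-6):
    print(eps, choose_A(eps), satisfies_definition(eps))
```

Note that `choose_A` depends only on ε and never on x, which is the requirement stated in the example.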


The properties of the limit given in section 4.1, p. 95, also apply 
to limits at infinity. As with limits at finite x, it is usually more con- 
venient to calculate limits by using these properties than by direct 
application of the definition. 


A rational function Example 7 
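A rational function is the quotient of two polynomials:
\[ R(x) = \frac{a_n x^n + \cdots + a_1 x + a_0}{b_m x^m + \cdots + b_1 x + b_0}. \qquad (4) \]
The following trick allows us to evaluate the limit of any such function
at infinity.


For example, let’s compute
\[ \lim_{x \to \infty} \frac{3x^2 + 3}{5x^2 + 7x - 39}. \]
The trick is to factor \(x^2\) from top and bottom. You get
\[
\begin{aligned}
\lim_{x \to \infty} \frac{3x^2 + 3}{5x^2 + 7x - 39}
&= \lim_{x \to \infty} \frac{x^2}{x^2} \cdot \frac{3 + 3/x^2}{5 + 7/x - 39/x^2} && \text{(algebra)} \\
&= \frac{\lim_{x \to \infty} (3 + 3/x^2)}{\lim_{x \to \infty} (5 + 7/x - 39/x^2)} && \text{(limit properties)} \\
&= \frac{3}{5}.
\end{aligned}
\]
At the end of this computation, we used the limit properties \((P_1)\) through \((P_6)\) to
break the limit down into simpler pieces like \(\lim_{x \to \infty} 39/x^2\), which
we can directly evaluate; for example, we have
\[ \lim_{x \to \infty} 39/x^2 = \lim_{x \to \infty} 39 \cdot \left(\frac{1}{x}\right)^2 = \Bigl(\lim_{x \to \infty} 39\Bigr) \cdot \Bigl(\lim_{x \to \infty} \frac{1}{x}\Bigr)^2 = 39 \cdot 0^2 = 0. \]
The other terms are similar.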
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Another rational function Example 8 
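Compute
\[ \lim_{x \to \infty} \frac{2x}{4x^3 + 5}. \]
We apply the same trick as in example 7 and factor x out of the
numerator and \(x^3\) out of the denominator. This leads to
\[
\begin{aligned}
\lim_{x \to \infty} \frac{2x}{4x^3 + 5}
&= \lim_{x \to \infty} \left(\frac{x}{x^3} \cdot \frac{2}{4 + 5/x^3}\right) \\
&= \lim_{x \to \infty} \left(\frac{1}{x^2} \cdot \frac{2}{4 + 5/x^3}\right) \\
&= \Bigl(\lim_{x \to \infty} \frac{1}{x^2}\Bigr) \cdot \Bigl(\lim_{x \to \infty} \frac{2}{4 + 5/x^3}\Bigr) \\
&= 0 \cdot \frac{2}{4} \\
&= 0.
\end{aligned}
\]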
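The factoring trick of examples 7 and 8 boils down to comparing the degrees of the numerator and denominator. The following sketch packages that comparison in a function; the name `limit_at_infinity` and the coefficient-list convention are assumptions made here for illustration. The third branch reports an infinite result in the sense of the next subsection, where such a limit does not exist as a real number.

```python
from fractions import Fraction
import math

def limit_at_infinity(num, den):
    """Limit as x -> +infinity of a rational function.

    num and den list coefficients from the highest power down, e.g.
    3x^2 + 3 is [3, 0, 3]. Leading coefficients must be nonzero.
    """
    deg_num, deg_den = len(num) - 1, len(den) - 1
    if deg_num < deg_den:
        return Fraction(0)                   # denominator wins, as in example 8
    if deg_num == deg_den:
        return Fraction(num[0], den[0])      # ratio of leading terms, as in example 7
    # Numerator wins: no finite limit; report the sign of the blow-up.
    return math.copysign(math.inf, num[0] * den[0])

print(limit_at_infinity([3, 0, 3], [5, 7, -39]))  # 3/5
print(limit_at_infinity([2, 0], [4, 0, 0, 5]))    # 0
```

Exact fractions are used so that the answer to example 7 prints as 3/5 rather than a rounded decimal.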


4.3.3 Limits that equal infinity 


Figure d shows a telephone wire strung between two poles, which
sags by some amount h in the middle. By increasing the tension T in
the wire, we can reduce the sag. That is, the necessary tension T is
some function T(h). There is a story, almost certainly apocryphal,
to the effect that a small-town mayor considered the sagging wires
unsightly, and instructed the public works department to tighten
them up enough so that they wouldn’t sag at all.


d/ A telephone wire sags by
an amount h.


It can be shown that the function T(h) is approximately given
by the equation
\[ T = \frac{k}{h}, \]
where k is a constant.¹ When I ask students what happens to this
equation when we plug in h = 0, I always get a chorus of “undefined!”
This shows good mathematical training (division by zero
is indeed undefined) but doesn’t give any real insight into what
will go wrong when the workers try to carry out the mayor’s plan.
If we make h smaller and smaller T will get bigger and bigger. By
making h sufficiently small, we can make T arbitrarily large. The
important insight here is that a quantity like 1/0 isn’t just undefined,
it’s undefined because it’s infinity, and infinity isn’t a real
number. If the workers actually try to make h = 0, they will simply
have to tighten the wires so much that the wires break.


Another way of putting this is that the limit \(\lim_{h \to 0} T(h)\) fails
to exist. Although it’s true that the limit doesn’t exist, we can be
more descriptive about the reason that it doesn’t. It’s a limit that
doesn’t exist because it equals infinity.





¹ The value of k is WL/8, where W is the weight of the wire and L is the
horizontal length. The approximation is good if h is small compared to L. 




Consider the limit
\[ \lim_{x \to 0} \frac{1}{x}. \]
As x decreases to x = 0 through smaller and smaller positive values,
its reciprocal 1/x becomes larger and larger. We say that instead of
going to some finite number, the quantity 1/x “goes to infinity” as
\(x \searrow 0\). In symbols:
\[ \lim_{x \searrow 0} \frac{1}{x} = \infty. \qquad (5) \]
Likewise, as x approaches 0 through negative numbers, its reciprocal
1/x drops lower and lower, and we say that 1/x “goes to −∞” as
\(x \nearrow 0\). Symbolically,
\[ \lim_{x \nearrow 0} \frac{1}{x} = -\infty. \qquad (6) \]


e/ The function 1/x behaves
badly near x = 0.


The limits (5) and (6) are not like the normal limits we have been 
dealing with so far. Namely, when we write something like 

\[ \lim_{x \to 2} x^2 = 4 \]
we mean that the limit actually exists and that it is equal to 4. On
the other hand, since we have agreed that ∞ is not a number (see
p. 65), the meaning of (5) cannot be to say that “the limit exists
and its value is ∞.”


Instead, when we write
\[ \lim_{x \to a} f(x) = \infty \qquad (7) \]
for some function y = f(x), we mean, by definition, that the limit
of f(x) does not exist, and that it fails to exist in a specific way: as
\(x \to a\), the value of f(x) becomes “larger and larger,” and in fact
eventually becomes larger than any finite number.


The language in that last paragraph shows you that this is an 
intuitive definition, at the same level as the first definition of limit 
we gave in section 2.1.1, p. 48. It contains the usual suspect phrases 
such as “larger and larger,” or “finite number” (as if there were 
any other kind.) A more precise definition involving epsilons can be 
given, but in this course we will not go into this much detail. 


When a function is going to blow up at a certain point, there are 
two common behaviors. The first is the one shown in figure e for 1/x,
where the limit is +∞ on one side and −∞ on the other. If a limit
is to be more than a one-sided limit, we want it to have the same
value on the left and right. In this example that doesn’t happen,
so only the one-sided limits can be described as being positive- or
negative-infinite:
\[ \lim_{x \searrow 0} \frac{1}{x} = +\infty \]
\[ \lim_{x \nearrow 0} \frac{1}{x} = -\infty \]
\[ \lim_{x \to 0} \frac{1}{x} \text{ can’t be described as } +\infty \text{ or } -\infty \]
The function \(1/x^2\), figure f, exhibits the other frequently encountered
behavior. Here we have a positive blowup on both sides, so it
isn’t just the one-sided limits that can be described.





As a final comment on infinite limits, it is important to realize 
that (7) is not a normal limit, and you cannot apply the limit rules 
to infinite limits. Here is an example of what goes wrong if you try 
anyway. 


Trouble with infinite limits Example 9 
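If you apply the limit properties to \(\lim_{x \searrow 0} 1/x = \infty\), then you could
conclude
\[ 1 = \lim_{x \searrow 0} x \cdot \frac{1}{x} = \lim_{x \searrow 0} x \cdot \lim_{x \searrow 0} \frac{1}{x} = 0 \cdot \infty = 0, \]
because “anything multiplied with zero is zero.”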


After using the limit properties in combination with this infinite limit 
we reach the absurd conclusion that 1 = 0. The moral of this story 
is that you can’t use the limit properties when some of the limits 
are infinite. 


4.4 Curve sketching 


4.4.1 Sketching a graph without knowing its equation 


The concepts of calculus, such as derivatives, limits, curvature, 
and concavity, can guide us in analyzing the behavior of a function 
even when we don’t know a formula for the function. In economics, 
for example, these concepts are used heavily even though real-world 
data can essentially never be described by a formula. This subsec- 
tion presents four examples in which we can use these concepts to 
sketch a function based on our understanding of how the function 
should behave in real life. 


The time to pay off a loan 


Most people will end up borrowing money at some point in their
lives, whether it’s credit card debt, a mortgage, a loan to buy a car,
or a cash advance from a payday loan company. One of the warning
signs that you may be walking into an exploitative situation is if
the person trying to sell you the loan emphasizes the low monthly
payment. Suppose that you’re borrowing $10,000 to buy a car, and
the monthly interest rate is 1%. Let p be the monthly payment,
and T the time required in order to pay off the loan. To understand
what’s going on here, you want to be able to visualize the graph of
T as a function of p. One fairly tedious way to do this would be to
find the equation of the function, take a piece of graph paper and
plot points. Another method would be to use an expensive graphing
calculator. But your knowledge of calculus gives you a method that
provides more insight with less work.


f/ The function \(1/x^2\) blows
up near x = 0, but in a different
way than 1/x; it approaches
positive infinity on both sides.


h/ The Laffer curve.


Clearly the smaller the payment, the longer it will take to pay 
off the loan. This tells us that T(p) is a decreasing function; its 
derivative will always be negative. 


If p is large, then you will pay off the loan so quickly that no 
significant amount of interest accrues. Therefore at large values of 
p, we will have T ≈ ($10,000)/p. This tells us that \(\lim_{p \to \infty} T = 0\).
The graph of T will approach the horizontal axis more and more 
closely as p gets bigger and bigger. We say that the function T(p) 
has a horizontal asymptote at zero. 


Finally, what happens if p is small? Remember, interest on the 
loan is accruing at a rate of 1% monthly, or $100 every month. It 
may sound like a good deal if you’re offered this loan with a low 
monthly payment of $101, but if you take the loan and always make 
the minimum payment, then the principal on the loan will only go 
down by $1 every month. You will die of old age before you pay 
off the car. We can therefore tell that \(\lim_{p \searrow \$100} T = \infty\). This is a
vertical asymptote on the graph. 


Figure g shows what the graph must look like. 
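The qualitative features argued above (a decreasing function, a horizontal asymptote at zero, and a vertical asymptote at p = $100) can be compared against an explicit formula. The sketch below assumes the standard amortization model with interest compounded monthly and a fixed payment at the end of each month, which the text does not spell out; the function name `months_to_pay_off` is likewise an illustrative choice.

```python
import math

def months_to_pay_off(payment, principal=10_000.0, monthly_rate=0.01):
    """Months needed to retire the loan with a fixed end-of-month payment.

    Uses the standard amortization formula; returns math.inf when the payment
    does not exceed the monthly interest (here $100), since the balance then
    never shrinks.
    """
    interest = principal * monthly_rate
    if payment <= interest:
        return math.inf
    return -math.log(1 - interest / payment) / math.log(1 + monthly_rate)

# Third column is the no-interest estimate 10000/p, accurate only for large p.
for p in (100, 101, 110, 200, 1000, 5000):
    print(p, months_to_pay_off(p), 10_000 / p)
```

Under these assumptions, a payment of $101 takes several hundred months, while for large payments T comes close to the estimate ($10,000)/p used in the text.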


The Laffer curve 


This example, a famous one, also has to do with money. In 1974, 
economist Arthur Laffer presented the following argument about 
taxes to politicians Dick Cheney and Donald Rumsfeld, sketching 
the resulting graph on a paper napkin. Consider the government’s 
tax revenue as a function of the tax rate. Clearly if the tax rate is 
zero, the government gets zero revenue. Most people would assume 
that the function was a purely increasing one, since raising the tax 
rate would always garner the government more money. 


But, Laffer said, that isn’t so. Imagine that the tax rate was 
100%, so that the government confiscated all of everyone’s earnings. 
Nobody would have any incentive to work, so they would stop work- 
ing, they would earn no taxable income, and revenue would drop to 
zero. Laffer sketched a graph like figure h on a paper napkin for
Cheney and Rumsfeld. There should be some intermediate tax rate,
he told them, that would produce the maximum revenue. Later,
when Ronald Reagan became president, he cut taxes on the theory
that the US was already on the right-hand side of the “Laffer
curve,” so that, counterintuitively, the lower taxes would produce
higher revenue. The results were not as Laffer had promised; the
average annual budget deficit during the Reagan administration was
$240 billion, compared to $57 billion during the preceding Carter
administration.


In calculus terms, our analysis of this function is an example of a 
result called Rolle’s theorem, p. 117. The idea is that if the function 
is smooth, then we expect its derivative to be continuous. If the 
derivative is positive on the left and negative on the right, then it 
must be zero at some intermediate point. This would be the point 
at which the function was maximized. 


Skydiving 


Figure i shows a skydiver’s altitude as a function of time. Early 
in the motion, soon after the person jumps out of the plane, the 
only significant force is gravity, and the person falls with constant 
acceleration (section 1.5.1, p. 22). The drop relative to the initial 
position equals (1/2)at², which is the equation of a parabola.


But as the downward (negative) velocity increases, the upward 
force of air friction gets stronger and stronger. In the opposite limit 
of t → ∞, the force of air friction gets closer and closer to being
strong enough to cancel the force of gravity. In this limit, Newton’s 
second law (section 3.4.2, p. 88) predicts an acceleration of zero. 
An acceleration of zero corresponds to constant velocity, so that the 
graph asymptotically approaches a line whose slope is the velocity. 


This graph demonstrates two mathematical properties. It has 
a y-intercept, which is the initial altitude. It also has an oblique 
asymptote, i.e., an asymptotic line that is neither horizontal nor 
vertical. 


A rock-climbing anchor 


For safety, rock climbers and mountaineers often wear a climbing 
harness and tie in to other climbers on a rope team or to anchors 
such as pitons or snow anchors. When using anchors, the climber 
usually wants to be protected by more than one, both for extra 
strength and for redundancy in case one fails. Figure j shows such 
an arrangement, with the climber hanging from a pair of anchors 
forming a “Y” at an angle θ. The usual advice is to make θ < 90°;
for large values of θ, the stress placed on the anchors can be many
times greater than the actual load L, so that two anchors are actually
less safe than one.


Consider the stress on the anchor S as a function of θ. For
physical reasons similar to those discussed in the example of the
telephone wire (section 4.3.3, p. 103), S must approach infinity as
θ approaches 180 degrees; no matter how tight the anchor strands
are made, the carabiner (hook) at the center will never be pulled up
quite as high as the anchors.


k/ Sketching y’ and y” given
the graph of y.


At @ = 0, we can see that each anchor strand will support half 
the load. The y-intercept of the graph equals L/2. 


We can gain further insight by extending the range of possible
values for $\theta$ to include negative angles. Physically, this corresponds
to bringing the anchor strands past one another and swapping the
roles of the two anchors. Since the physical setup is symmetrical,
the function $S(\theta)$ must have the property $S(\theta) = S(-\theta)$, i.e., it is
an even function. It might seem pointless to discuss this symmetry,
but it tells us something important. An argument identical to the
one in section 1.2.4, p. 17, tells us that based on this symmetry, the
derivative S’ must equal zero at $\theta = 0$. This means that for small
values of $\theta$, the stress on the anchor will be very nearly the same
as for $\theta = 0$, i.e., hardly any greater than half the load. Thus any
small value of $\theta$ is about equally good, but very large values could
be a deadly mistake.
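
As a rough numerical companion to this discussion, the sketch below tabulates the stress under an assumed formula. The text deliberately works without an equation for S; the expression $S = L/(2\cos(\theta/2))$ used here is the standard statics result for an idealized, symmetric two-strand anchor, and it is an assumption brought in purely for illustration, as is the hypothetical function name `anchor_stress`.

```python
import math


def anchor_stress(theta_deg, load=1.0):
    """Tension in each strand of an idealized symmetric two-anchor 'Y'.

    Assumes S = L / (2 cos(theta/2)). This formula is an assumption made
    for illustration; it is not derived in the text.
    """
    half_angle = math.radians(theta_deg) / 2.0
    return load / (2.0 * math.cos(half_angle))


# Tabulate S/L for a range of angles between the strands.
for theta in (0, 10, 30, 60, 90, 120, 150, 170, 179):
    print(f"theta = {theta:3d} deg   S/L = {anchor_stress(theta):8.3f}")
```

Under this assumption S is exactly L/2 at $\theta = 0$, changes very little for small angles, reaches the full load L at 120 degrees, and grows without bound as $\theta$ approaches 180 degrees, in line with the qualitative sketch.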


4.4.2 Sketching f’ and f” given the graph of f 


In figure k we revisit the example of fermenting beer (section 
3.1, p. 83). (Feel free to mark your place in the book and make 
a trip to the fridge before continuing.) The top panel of the graph 
would probably have been the easiest to sketch starting from scratch. 
Clearly the amount of $\mathrm{CO}_2$ produced starts off at zero, it rises, and
it must eventually flatten out and approach a horizontal asymptote, 
since the yeast use up all their food and can’t produce any more. 
This kind of vaguely S-shaped curve is in fact encountered in many 
situations, and is often referred to as a “yeast curve.” 


Now suppose we know y and we want to find y’ and y”. The 
basic concept is that the slope of each graph in the stack gives the 
value of the graph below it. The slope of the tangent line to the y 
graph at time A is small and positive, while the slope at B is larger 
and positive. Therefore the values of y' at these times must be small 
and positive, then larger and positive. At time C, the slope of the 
y graph is as great as it will ever be. Therefore the y’ graph has a 
maximum there. The slope of y gets smaller at D and still smaller 
at E, so the value of y’ must taper off correspondingly.


Now that we’ve sketched the graph of y’, we can continue the 
process and construct its derivative, y”. At time C the slope of the
y’ graph is zero, so the value of the y” graph is zero; this is a point 
of inflection. At times earlier than C the slope of y’ is positive, while 
at times later than C it’s negative. Therefore we must have y” > 0 
before C and y” < 0 after. 


We can also relate the properties of the y” graph directly to those
of the y graph. The second derivative is a measure of curvature, and
its sign indicates concavity. The y graph is concave up before C and
concave down after. This matches up with the signs of y”.
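
The stacking rule can be tried out numerically. The sketch below assumes a logistic function as a stand-in for the yeast curve (the curve in figure k is not given by a formula, so this choice and the names `y`, `y1`, and `y2` are illustrative only) and estimates y’ and y” by finite differences. With this assumed curve, the role of time C is played by t = 0.

```python
import math


def y(t):
    # Hypothetical S-shaped "yeast curve"; the logistic form is an assumption.
    return 1.0 / (1.0 + math.exp(-t))


def derivative(f, t, h=1e-4):
    # Centered finite-difference estimate of f'(t).
    return (f(t + h) - f(t - h)) / (2.0 * h)


def y1(t):
    # Estimate of y'(t): the slope of the y graph.
    return derivative(y, t)


def y2(t):
    # Estimate of y''(t): the slope of the y' graph.
    return derivative(y1, t, h=1e-3)


for t in (-4, -2, 0, 2, 4):
    print(f"t = {t:+d}   y = {y(t):.4f}   y' = {y1(t):.4f}   y'' = {y2(t):+.4f}")
```

In the printed table y’ is positive everywhere and largest at t = 0, while y” is positive before that point and negative after it, which is the pattern described for times A through E.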
Discussion question


A Figure l shows three stacks of graphs, each of which is supposed
to represent the position, velocity, and acceleration of an object. Explain
how each set of graphs contains inconsistencies, and fix them.








4.4.3 Sketching a graph given its equation 


If we have an equation defining a function, then the following 
procedure is often a fairly efficient way of sketching its graph. Often 
we are especially interested in finding the function’s local maxima 
and minima, including the absolute or global maxima and minima. 
That is, the absolute maximum is the greatest value ever attained 
by the function, and similarly for the absolute minimum. 


1. Find all solutions of f’(x) = 0 in the interval [a,b]: these are 
called the critical or stationary points for f. 


2. Find the sign of f’(x) at all other points.


3. Each stationary point at which f’(x) actually changes sign is
a local maximum or local minimum. Compute f(x) at each
stationary point.


4. Compute the values of the function f(a) and f(b) at the
endpoints of the interval.


5. The absolute maximum is attained at the stationary point or 
the boundary point with the highest value of f; the absolute 
minimum occurs at the boundary or stationary point with the 
smallest value. 


If the interval is unbounded, then instead of computing the values
f(a) or f(b), you should compute $\lim_{x \to \pm\infty} f(x)$.
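
The five steps above can be sketched numerically for a bounded interval. The example function $x^3 - 3x$ on [-1.5, 3], the grid scan, and the bisection helper are all assumptions chosen for illustration; in practice one would usually solve f’(x) = 0 by hand, as in the worked example that follows. Note that the scan only detects stationary points where f’ changes sign between grid points, which are the only ones that can be local extrema anyway.

```python
def bisect(g, lo, hi, tol=1e-12):
    """Locate a sign change of g in [lo, hi]; assumes g(lo), g(hi) differ in sign."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid      # sign change lies to the right of mid
        else:
            hi = mid                   # sign change lies to the left of mid
    return 0.5 * (lo + hi)


def extrema_on_interval(f, fprime, a, b, steps=1000):
    """Stationary points where f' changes sign, plus the absolute max and min."""
    xs = [a + (b - a) * i / steps for i in range(steps + 1)]
    # Steps 1-3: find points where f' changes sign between neighboring grid points.
    stationary = [bisect(fprime, left, right)
                  for left, right in zip(xs, xs[1:])
                  if fprime(left) * fprime(right) < 0]
    # Steps 4-5: compare the stationary points with the two endpoints.
    candidates = [a] + stationary + [b]
    x_max = max(candidates, key=f)
    x_min = min(candidates, key=f)
    return stationary, (x_max, f(x_max)), (x_min, f(x_min))


def f(x):
    return x**3 - 3 * x


def fprime(x):
    return 3 * x**2 - 3


stationary, highest, lowest = extrema_on_interval(f, fprime, -1.5, 3.0)
print("stationary points:", stationary)
print("absolute maximum (x, f):", highest)
print("absolute minimum (x, f):", lowest)
```

For this assumed example the absolute maximum sits at the right endpoint, while the absolute minimum sits at an interior stationary point, so both cases of step 5 show up.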





l / Discussion question A.


m / The sign of the derivative changes at A and B.


As an example, let’s sketch the graph of the rational function

$$f(x) = \frac{x(3 - 4x)}{1 + x^2}.$$

By looking at the signs of numerator and denominator we see that

$$f(x) > 0 \text{ for } 0 < x < \tfrac{3}{4},$$
$$f(x) < 0 \text{ for } x < 0 \text{ and also for } x > \tfrac{3}{4}.$$


We compute the derivative of f:

$$f'(x) = \frac{-3x^2 - 8x + 3}{(1 + x^2)^2}.$$

Hence f’(x) = 0 if and only if

$$-3x^2 - 8x + 3 = 0,$$


and the solutions to this quadratic equation are $-3$ and $1/3$. These
two roots will appear several times, and it will shorten our formulas
if we abbreviate

$$A = -3 \text{ and } B = 1/3.$$


To see if the derivative changes sign we factor the numerator and
denominator. The denominator is always positive, and the numerator is

$$-3x^2 - 8x + 3 = -3\left(x^2 + \tfrac{8}{3}x - 1\right) = -3(x - A)(x - B).$$

Therefore

$$f'(x) \begin{cases} < 0 & \text{for } x < A \\ > 0 & \text{for } A < x < B \\ < 0 & \text{for } x > B. \end{cases}$$


It follows that f is decreasing on the interval $(-\infty, A)$, increasing
on the interval (A, B) and decreasing again on the interval $(B, \infty)$
(figure m). Therefore


A is a local minimum, and B is a local maximum. 


Are these global maxima and minima? 


Since we are dealing with an unbounded interval we must compute
the limits of f(x) as $x \to \pm\infty$. We find

$$\lim_{x \to \infty} f(x) = \lim_{x \to -\infty} f(x) = -4.$$

Since f is decreasing between $-\infty$ and A, it follows that

$$f(A) \le f(x) < -4 \text{ for } -\infty < x \le A.$$

Similarly, f is decreasing from B to $+\infty$, so

$$-4 < f(x) \le f(B) \text{ for } B \le x < \infty.$$

Between the two stationary points the function is increasing, so

$$f(A) \le f(x) \le f(B) \text{ for } A \le x \le B.$$


From this it follows that f(x) has a global minimum when $x = A = -3$
and has a global maximum when $x = B = 1/3$.
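
The numbers in this example can be checked with exact rational arithmetic. The sketch below assumes the example function is $f(x) = x(3 - 4x)/(1 + x^2)$, which is the form consistent with the derivative, the stationary points, and the limits quoted in the text; the helper names are hypothetical.

```python
from fractions import Fraction


def f(x):
    # Assumed form of the example function (see the note above).
    return x * (3 - 4 * x) / (1 + x * x)


def fprime_numerator(x):
    # Numerator of f'(x) as quoted in the text.
    return -3 * x * x - 8 * x + 3


A, B = Fraction(-3), Fraction(1, 3)

# Both stationary points should make the numerator of f' vanish.
print("numerator of f' at A and B:", fprime_numerator(A), fprime_numerator(B))
print("f(A) =", f(A), "  f(B) =", f(B))

# Far from the origin the values should settle toward the limit -4.
for x in (10, 1000, 100000):
    right = float(f(Fraction(x)))
    left = float(f(Fraction(-x)))
    print(f"f({x}) = {right:.6f}   f({-x}) = {left:.6f}")
```

Since f(A) lies below the limiting value of -4 and f(B) lies above it, the stationary points really are the global minimum and maximum.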


[Figure: graph of f, with the absolute maximum and the absolute minimum marked.]


4.5 Completeness 


4.5.1 The completeness axiom of the real numbers 


Calculus is the study of rates of change (differentiation) and 
how change accumulates (integration, which we haven’t encountered 
yet). What changes is always a function, and the function takes an 
input value that belongs to its domain and gives back an output that 
belongs to its range. The domain and range could in principle be 
sets of integers, rational numbers, real numbers, complex numbers, 
or hyperreal numbers (section 2.9, p. 64). These number systems 
all share many of the same properties, but just as the ocean is the 
natural setting for a pirate story, there is a sense in which the real 
numbers are the natural setting in which to do calculus. Throughout 
this book, without specifically commenting on it so far, we’ve been 
considering only functions that take real-number inputs and give 
back real-number outputs: real functions. 


What’s so special about real functions? We can define functions 
whose inputs and outputs are, say, integers, and such functions are 
of interest in many fields of mathematics. But real functions are 
especially well suited to describing rates of change. As an example, 
the graph in figure o shows the function $f(x) = 2 - x^2$. Let’s say
this represents the arc of a cannonball shot off of a cliff into the
ocean, where a y coordinate of 0 represents the surface of the water.
Our geometrical intuition tells us that if the ball starts above the
water, and later on ends up below it, then there must be some point
at which it enters the water. In other words, if the graph of the
function f cuts across the line y = 0, then there must be a point at
which they coincide.


p / 1. The sets P and Q are separated on the number line so
that every point in P is to the left of every point in Q. By the
completeness axiom, a number like z exists. 2. By the completeness
axiom, the curve $f(x) = 2 - x^2$ must intersect the axis. The point
of intersection is $z = \sqrt{2}$. The completeness axiom doesn’t hold
for the rational numbers, and we can see that here because z is
an irrational number.


But if we consider a set of numbers more restricted than the 
real numbers, this may not happen. For example, suppose we take 
f to be a function whose inputs and outputs are rational numbers. 
Recall that a rational number is any number that can be expressed 
as an integer divided by another integer, e.g., the fraction 2/3. But 
the place where our cannonball crosses sea level has $x = \sqrt{2}$, which
is not a rational number. This example shows that the graphs of two 
rational-number functions can cut across one another without ever 
touching! This offends our intuition about rates of change, since 
we expect that if we change a variable smoothly from one value to 
another, it should visit every value in between. 


What is the special ingredient, the secret sauce that allows the 
real number system to avoid such paradoxical results as the one 
about the cannonball? It seems that the reals are somehow more 
densely packed on the number line than the rationals, but how do 
we define this density property in mathematical terms? It can’t be 
any of the elementary properties of the reals (section 1.6, p. 25), 
since the rationals also satisfy all of those properties. We need to 
add a new axiom, which is called the completeness axiom. 


One possible way of stating such an axiom is the following. 


Completeness axiom 
Let P and Q be sets of numbers such that every number in P is 
smaller than every number in Q. Then there exists some number z 
such that z is greater than or equal to every number in P, but less 
than or equal to any number in Q. 


As an example, let P be the set of all positive rational numbers x such
that $x^2 < 2$, and Q the set of positive rational x such that $x^2 > 2$.
Then the number z would have to be $\sqrt{2}$, which is not rational, and
this shows that the rationals are not complete. The
reals are complete, and the completeness axiom can serve as one of
the fundamental axioms of the real numbers.
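
The cut around $\sqrt{2}$ can be explored with exact rational arithmetic. This is only an illustration, not a proof: it assumes that P and Q are restricted to positive rationals, and it uses repeated halving (bisection), a technique that is not part of this section.

```python
from fractions import Fraction

lo, hi = Fraction(1), Fraction(2)     # 1 is in P (1 < 2), 2 is in Q (4 > 2)

for _ in range(20):
    mid = (lo + hi) / 2
    assert mid * mid != 2             # none of these rational midpoints squares to 2
    if mid * mid < 2:
        lo = mid                      # mid belongs to P
    else:
        hi = mid                      # mid belongs to Q

print("largest member of P found: ", float(lo))
print("smallest member of Q found:", float(hi))
print("gap between them:          ", float(hi - lo))
```

The gap between the two sets can be made as small as we like, yet every rational number tested lands strictly in P or strictly in Q; the number z that separates them is missing from the rationals.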


The completeness axiom is of a fundamentally different character
than the elementary axioms. The elementary axioms make
statements such as “for any number x,...” or “for any numbers x
and y, ...” The completeness axiom says “for any sets of numbers
P and Q,...”


Every decimal is a real number Example 10 


Consider the infinite decimal 
3.141592..., 


which is the decimal expansion of $\pi$. We can use the completeness
axiom to prove that this is a real number. Let P be the list
of rational numbers given by {3, 3.1, 3.14, 3.141, ...}. Let Q be
the set of rational numbers that are larger than every number in
P. Then the real number whose existence is asserted by the
completeness axiom is exactly $\pi$. Similar reasoning shows that any
decimal corresponds to some real number (which can be shown 
to be unique). (Note, however, that the same real number can 
have more than one decimal expansion. For example, the infinite 
repeating decimals 1.000... and 0.999... both equal 1.) 


The Archimedean property Example 11 
The Archimedean principle states that there is no positive real
number that is less than 1/1, less than 1/(1+1), less than 1/(1 +
1 + 1), and so on.² In other words, it says that there are no
real numbers that are infinitely small, but still greater than zero. 
The Archimedean property can be proved from the completeness 
property. For suppose, to the contrary, that we did have such a 
real number. Then it would be less than 1/10, so its first decimal 
place would be 0. It would also be less than 1/100, so its second 
decimal place would also be zero. Continuing in this way, we find 
that the decimal expansion of such a number must be 0.000..., 
with the zeroes repeating forever. But this is the decimal expan- 
sion of zero, and we already know that every decimal expansion 
corresponds to a unique real number. Therefore our number is 
zero, and this is a contradiction, since we assumed that it vio- 
lated the Archimedean principle, which refers to a positive real 
number. 





² Cf. section 2.9, p. 64. For an application to economics, see rule 3, p. 218.


4.5.2 The intermediate and extreme value theorems


The following two theorems can be proved from the completeness 
property and the elementary properties of the reals, but we will not 
give the proofs here. 


The intermediate value theorem 


Intuitively, the intermediate value theorem says that the real 
numbers aren’t susceptible to paradoxes like the cannonball paradox 
described above. Or, we can say that if you are moving continuously 
along a road, and you get from point A to point B, then you must 
also visit every other point along the road; only by teleporting (by 
moving discontinuously) could you avoid doing so. More formally, 
the theorem says this: 





Intermediate value theorem 

If y is a continuous real-valued function on the real interval
from a to b, and if y takes on values $y_1$ and $y_2$ at certain points
within this interval, then for any $y_3$ between $y_1$ and $y_2$, there
is some real x in the interval for which $y(x) = y_3$.





q / The intermediate value theorem states that if the function is
continuous, it must pass through $y_3$.


Example 12
> Show that there is a solution to the equation $10^x + x = 1000$.

> We expect there to be a solution near x = 3, where the function
$f(x) = 10^x + x = 1003$ is just a little too big. On the other hand,
f(2) = 102 is much too small. Since f has values above and
below 1000 on the interval from 2 to 3, and f is continuous, the
intermediate value theorem proves that a solution exists between
2 and 3. If we wanted to find a better numerical approximation
to the solution, we could do it using Newton’s method, which is
introduced in section 7.2.
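
The existence argument also suggests a simple way to home in on the solution: keep halving the interval while preserving the sign change. The sketch below uses this bisection idea rather than Newton's method, which the text defers to section 7.2; the tolerance and variable names are arbitrary choices.

```python
def f(x):
    # Left-hand side of the equation 10^x + x = 1000.
    return 10.0 ** x + x


target = 1000.0
lo, hi = 2.0, 3.0                  # f(2) = 102 is too small, f(3) = 1003 is too big
assert f(lo) < target < f(hi)

# Halve the interval, always keeping f(lo) < target <= f(hi).
while hi - lo > 1e-12:
    mid = 0.5 * (lo + hi)
    if f(mid) < target:
        lo = mid
    else:
        hi = mid

root = 0.5 * (lo + hi)
print("approximate solution:", root)
print("f(root) - 1000 =", f(root) - target)
```

The intermediate value theorem is what guarantees that each halved interval still contains a solution, since f is continuous and takes values on both sides of 1000 at the interval's ends.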


Example 13 
> Show that there is at least one solution to the equation cos x = 
x, and give bounds on its location.


> This is what’s known as a transcendental equation, and no 
amount of fiddling with algebra and trig identities will ever give 
a closed-form solution, i.e., one that can be written down with 
a finite number of arithmetic operations to give an exact result. 
However, we can easily prove that at least one solution exists, 
by applying the intermediate value theorem to the function $f(x) =
x - \cos x$. The cosine function is bounded between $-1$ and 1, so
f must be negative for $x < -1$ and positive for $x > 1$. By the
intermediate value theorem, there must be a solution in the interval
$-1 \le x \le 1$. The graph, figure r, verifies this, and shows that there is
only one solution.
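
The graph suggests uniqueness; a short derivative argument, added here as a supplementary sketch rather than as part of the original example, points the same way:

```latex
f(x) = x - \cos x, \qquad f'(x) = 1 + \sin x \ge 0 \quad \text{for all real } x .
```

The derivative vanishes only at the isolated points $x = -\pi/2 + 2\pi k$ for integer k, so f is strictly increasing and can cross zero at most once. Together with the intermediate value theorem, this gives exactly one solution.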











r / The function $x - \cos x$ constructed in example 13.


Supply and demand Example 14
Figure s shows two graphs representing the supply and demand 
of some good on a free market. The function D(p) shows the 
quantity that buyers would willingly buy at unit price p. Normally 
D is a decreasing function: if the price goes up, people don’t buy
as much. (But cf. problem c4, p. 37.) The function S(p) shows the 
quantity that the seller would willingly offer if the unit price was p. 
Often S is an increasing function. For example, Boeing might only 
be able to produce more passenger jets by paying their workers 
overtime, which would create a cost that they would pass on to 
their customers. 


Suppose that, as in the example shown in the figure, D starts 
out higher than S on the left, but ends up lower than S on the 
right. Then we expect geometrically that if the curves are contin- 
uous, they must cross at some point. This can be proved using 
the same technique as in example 13. We construct a function 
f(p) = S(p) — D(p), which goes from negative to positive. By 
the intermediate value theorem, there must be some point where 
f = 0, meaning that S = D. This crossing point is the free-market 
equilibrium. 


The intermediate value theorem holds for real numbers, but in 
fact neither the price nor the quantity is free to have any real- 
number value. For example, Boeing can’t sell half an airplane. 
In some cases this might mean that the free-market equilibrium 
defined by S = D would not exist. An example might be the Con- 
corde, a supersonic passenger jet, which flew from 1969 to 2003. 
The nonexistence of the market for this plane today may indicate 
that the supply and demand curves now cross at a quantity that 
is greater than 0 and less than 1, which is not a possible
free-market equilibrium because the planes can only be sold in whole
numbers. 


Example 15 
> Prove that every odd-order polynomial P with real coefficients 
has at least one real root x, i.e., a point at which P(x) = 0. 


> Example 13 might have given the impression that there was 
nothing to be learned from the intermediate value theorem that 
couldn’t be determined by graphing, but this example clearly can’t 
be solved by graphing, because we're trying to prove a general 
result for all polynomials. 


To see that the restriction to odd orders is necessary, consider 
the polynomial $x^2 + 1$, which has no real roots because $x^2 \ge 0$ for
any real number x. 


s / Example 14. (Axes: quantity sold versus unit price, p.)


t / The function $x - \sin 1/x$.


To fix our minds on a concrete example for the odd case, consider
the polynomial $P(x) = x^3 - x + 17$. For large values of x, the
linear and constant terms will be negligible compared to the $x^3$
term, and since $x^3$ is positive for large values of x and negative
for large negative ones, it follows that P is sometimes positive
and sometimes negative. Therefore by the intermediate value
theorem P has at least one root.


This argument didn’t depend much on the specific polynomial P 
chosen as an example. The fact that P was positive for large x 
and negative for large negative x followed merely from the fact 
that P was of odd order. Therefore the result holds for all polyno- 
mials of odd order. 
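
The argument can be mirrored numerically for the concrete polynomial above. The sketch below first widens a symmetric interval until P has opposite signs at its ends, which must eventually happen for an odd-order polynomial, and then narrows it by bisection. The doubling strategy and the variable names are assumptions made for illustration, and the code relies on the leading coefficient being positive.

```python
def P(x):
    return x**3 - x + 17


# Widen the interval [-r, r] until P changes sign across it.
r = 1.0
while P(-r) * P(r) > 0:
    r *= 2.0

lo, hi = -r, r                      # here P(lo) < 0 < P(hi)

# Narrow the interval by bisection, keeping the sign change inside it.
while hi - lo > 1e-12:
    mid = 0.5 * (lo + hi)
    if P(mid) < 0:
        lo = mid
    else:
        hi = mid

root = 0.5 * (lo + hi)
print("bracket half-width r =", r)
print("approximate root:", root, "  P(root) =", P(root))
```

The widening loop is the computational counterpart of the observation that the cubic term eventually dominates the linear and constant terms.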


Example 16 
> Show that the equation x = sin 1/x has infinitely many solutions. 


> This is another example that can’t be solved by graphing; there 
is clearly no way to prove, just by looking at a graph like t, that the 
function f(x) = x—sin 1/x crosses the x axis infinitely many times. 
The graph does, however, help us to gain intuition for what’s going 
on. As x gets smaller and smaller, 1/x blows up, and sin 1/x 
oscillates more and more rapidly. The function f is undefined 
at 0, but it’s continuous everywhere else, so we can apply the 
intermediate value theorem to any interval that doesn’t include 0. 


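We want to prove that for any positive u, there exists an x with
0 < x < u for which f(x) has either desired sign. Let n be an
even integer such that n > 10 and also $\pi n > 1/u$. Then clearly
f(x) is negative at $x = 1/(\pi n + \pi/2) < u$, since $\sin 1/x = 1$ and x
is small. Similarly, f(x) is positive at $x = 1/(\pi n + 3\pi/2) < u$,
where $\sin 1/x = -1$. Since f is continuous between these two points,
the intermediate value theorem gives a root between them, and hence a
root in the interval from 0 to u for every positive u. This
establishes the desired result.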
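
The recipe in this proof is easy to test numerically. The sketch below follows it literally for a few sample values of u; the function name `sample_points` and the chosen values of u are hypothetical.

```python
import math


def f(x):
    return x - math.sin(1.0 / x)


def sample_points(u):
    """Return (x_neg, x_pos), both in (0, u), following the example's recipe."""
    n = 12                                   # an even integer greater than 10
    while math.pi * n <= 1.0 / u:
        n += 2                               # stay even until pi*n > 1/u
    x_neg = 1.0 / (math.pi * n + math.pi / 2)        # sin(1/x) = +1 here
    x_pos = 1.0 / (math.pi * n + 3 * math.pi / 2)    # sin(1/x) = -1 here
    return x_neg, x_pos


for u in (1.0, 0.1, 0.001):
    x_neg, x_pos = sample_points(u)
    print(f"u = {u}:  f({x_neg:.6g}) = {f(x_neg):+.4f},  f({x_pos:.6g}) = {f(x_pos):+.4f}")
```

However small u is made, the two sample points lie inside the interval from 0 to u and give values of f with opposite signs, so a root is trapped between them.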


The extreme value theorem 


We’ve seen that locating maxima and minima of functions
may in general be fairly difficult, because there are so many differ- 
ent ways in which a function can attain an extremum: e.g., at an 
endpoint, at a place where its derivative is zero, or at a nondifferen- 
tiable kink. The following theorem allows us to make a very general 
statement about all these possible cases, assuming only continuity. 


Extreme value theorem 

If f is a continuous real-valued function on the real-number 
interval defined by $a \le x \le b$, then f has maximum and
minimum values on that interval, which are attained at specific 
points in the interval. 


Let’s first see why the assumptions are necessary. If we weren’t
confined to a finite interval, then y = x would be a counterexample,
because it’s continuous and doesn’t have any maximum or minimum
value. If we didn’t assume continuity, then we could have a function
defined as y = x for x < 1, and y = 0 for $x \ge 1$; this function never
gets bigger than 1, but it never attains a value of 1 for any specific
value of x. If we didn’t assume a real function, then we could have,
for example, the function $f(x) = (x^2 - 2)^2$ defined on the rational
numbers, which would never attain the minimum value of 0 because
$\sqrt{2}$ isn’t a rational number.


> Example 17 
Find the maximum value of the polynomial $P(x) = x^3 + x^2 + x + 1$
for $-5 \le x \le 5$.


> Polynomials are continuous, so the extreme value theorem guar- 
antees that such a maximum exists. Suppose we try to find it by 
looking for a place where the derivative is zero. The derivative is
$3x^2 + 2x + 1$, and setting it equal to zero gives a quadratic equation,
but application of the quadratic formula shows that it has no
real solutions. It appears that the function doesn’t have a max- 
imum anywhere (even outside the interval of interest) that looks 
like a smooth peak. Since it doesn’t have kinks or discontinuities, 
there is only one other type of maximum it could have, which is a 
maximum at one of its endpoints. Plugging in the limits, we find 
P(—5) = —104 and P(5) = 156, so we conclude that the maximum 
value on this interval is 156. 
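
The steps of this solution can be confirmed with a few lines of arithmetic. The sketch below assumes the polynomial is $P(x) = x^3 + x^2 + x + 1$, consistent with the derivative and endpoint values quoted above; the brute-force grid at the end is an extra sanity check, not part of the original argument.

```python
def P(x):
    return x**3 + x**2 + x + 1


# P'(x) = 3x^2 + 2x + 1, with quadratic coefficients a, b, c.
a, b, c = 3, 2, 1
discriminant = b * b - 4 * a * c     # negative means P' has no real zeros
print("discriminant of P':", discriminant)

print("P(-5) =", P(-5), "  P(5) =", P(5))

# Sample P on a fine grid over [-5, 5] and take the largest value seen.
grid_max = max(P(-5 + 10 * i / 1000) for i in range(1001))
print("largest value on the grid:", grid_max)
```

A negative discriminant means the derivative never vanishes, so the maximum has to sit at an endpoint, and the grid search agrees with the value found at x = 5.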


4.5.3 Rolle’s theorem and the mean-value theorem 


On p. 106, in the example of the Laffer curve from economics, 
we got a preview of the following intuitively appealing theorem. 


Rolle’s theorem 

Let f be a function that is continuous on the interval [a, b]
and differentiable on (a,b), and let f(a) = f(b). Then there
exists a point $x \in (a,b)$ such that f’(x) = 0.


Proof: By the extreme value theorem, f attains its maximum 
and minimum values in [a,b]. If both of these are at endpoints, then 
f is a constant function, and the theorem holds trivially. Suppose 
instead that at least one of these extrema is on the interior of the 
interval. Then by the theorem given in section 2.8.3, f’ is zero at 
that point, and the theorem also holds. 














Rolle’s theorem can be straightforwardly generalized to the fol- 
lowing. 


Mean value theorem 

Let f be a function that is continuous on the interval [a, b] and
differentiable on (a,b). Then there exists a point $x \in (a,b)$
such that

$$f'(x) = \frac{f(b) - f(a)}{b - a},$$

meaning that the derivative equals the average (mean) rate of
change of the function between the endpoints of the interval.


“Mean” is just a fancy word for “average.” In general, it’s a mistake
to try to calculate a rate of change without calculus, using $\Delta y/\Delta x$,
unless the rate of change is constant. The mean value theorem says
that just as a broken clock is right twice a day, there is at least one
point where $\Delta y/\Delta x$ gives the right answer.


u / Rolle’s theorem.


v / The mean value theorem.


Proof: Define the function

$$\ell(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a),$$

which is the point-slope form of the line passing through the
endpoints of the graph of f. Define a new function $g(x) = f(x) - \ell(x)$,
so that g(a) = g(b) = 0. Applying Rolle’s theorem to g, we find
that there is some point where $f'(x) = \ell'(x)$, which is the desired
result, since $\ell'(x)$ is the constant slope $(f(b) - f(a))/(b - a)$.
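
To see the theorem at work, we can look for the promised point numerically. The example $f(x) = x^3$ on the interval from 0 to 2 is a hypothetical choice; bisection is usable here only because f’ minus the mean slope happens to have opposite signs at the two ends of this particular interval, which the theorem itself does not guarantee in general.

```python
import math


def f(x):
    return x**3


def fprime(x):
    return 3 * x**2


a, b = 0.0, 2.0
mean_slope = (f(b) - f(a)) / (b - a)     # average rate of change on [a, b]

# g'(x) = f'(x) - mean_slope is negative at a and positive at b in this example.
lo, hi = a, b
while hi - lo > 1e-12:
    mid = 0.5 * (lo + hi)
    if fprime(mid) - mean_slope < 0:
        lo = mid
    else:
        hi = mid

c = 0.5 * (lo + hi)
print("mean slope:", mean_slope)
print("point where f' equals the mean slope:", c)
print("exact value 2/sqrt(3):", 2 / math.sqrt(3))
```

The point found lies strictly inside the interval, and at that point the tangent line is parallel to the line through the endpoints.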














4.6 Two tricks with limits 
4.6.1 Rational functions that give 0/0 


Suppose we want to compute the following limit:

$$\lim_{x \to 2} f(x), \qquad f(x) = \frac{x^2 - 2x}{x^2 - 4}.$$

We first use the limit properties to find

$$\lim_{x \to 2} (x^2 - 2x) = 0 \quad \text{and} \quad \lim_{x \to 2} (x^2 - 4) = 0.$$

Now to complete the computation we would like to apply the
property (P6) about quotients, but this would give us

$$\lim_{x \to 2} f(x) = \frac{0}{0}.$$

The denominator is zero, so we were not allowed to use (P6) (and the
result doesn’t mean anything anyway). We have to do something
else.


The function we are dealing with is a rational function, which 
means, as mentioned in example 7, p. 102, that it is the quotient of 
two polynomials. For such functions there is an algebra trick that 
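always allows you to compute the limit even if you first get 0/0. The
thing to do is to divide numerator and denominator by $x - 2$. In
our case we have

$$x^2 - 2x = (x - 2) \cdot x, \qquad x^2 - 4 = (x - 2) \cdot (x + 2),$$

so that

$$\lim_{x \to 2} \frac{x^2 - 2x}{x^2 - 4} = \lim_{x \to 2} \frac{(x - 2) \cdot x}{(x - 2) \cdot (x + 2)} = \lim_{x \to 2} \frac{x}{x + 2}.$$

After this simplification we can use the limit properties to compute

$$\lim_{x \to 2} \frac{x^2 - 2x}{x^2 - 4} = \lim_{x \to 2} \frac{x}{x + 2} = \frac{2}{2 + 2} = \frac{1}{2}.$$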
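
A quick numerical experiment shows what the cancellation accomplishes. The sketch below compares the quotient $(x^2 - 2x)/(x^2 - 4)$, reconstructed from the factorizations above, with its simplified form x/(x + 2) at points near 2; the function names and sample offsets are arbitrary.

```python
def original(x):
    # The quotient before cancellation; undefined (0/0) at x = 2.
    return (x**2 - 2 * x) / (x**2 - 4)


def simplified(x):
    # The quotient after dividing numerator and denominator by (x - 2).
    return x / (x + 2)


for h in (1e-1, 1e-3, 1e-5):
    for x in (2 - h, 2 + h):
        print(f"x = {x:.5f}   original = {original(x):.8f}   simplified = {simplified(x):.8f}")

print("simplified form evaluated at x = 2:", simplified(2.0))
```

The two expressions agree at every point other than x = 2, but only the simplified one can be evaluated at 2 itself, which is why the cancellation lets the limit properties apply.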
4.6.2 The “don’t make $\delta$ too big” trick


In this section we describe a trick, the “don’t make $\delta$ too big”
trick, that is sometimes helpful when we want to evaluate a limit
directly from the epsilon-delta definition. Say we want to prove that
$\lim_{x \to 1} x^2 = 1$. This may not seem to require a fancy proof, since
obviously plugging in x = 1 gives $x^2 = 1$. But since functions can
be discontinuous, plugging in does not always prove the value of a
limit. Also, this example will be an excuse to develop a technique
that can be useful in less trivial cases.





We have $f(x) = x^2$, a = 1, L = 1, and as usual when computing
a limit the question is, “how small should $|x - 1|$ be to guarantee
$|x^2 - 1| < \epsilon$?”





We begin by estimating the difference $|x^2 - 1|$:

$$|x^2 - 1| = |(x - 1)(x + 1)| = |x + 1| \cdot |x - 1|.$$


As x approaches 1 the factor $|x - 1|$ becomes small, and if the other
factor $|x + 1|$ were a constant (e.g. 2 as in the previous example)
then we could find $\delta$ as before, by dividing $\epsilon$ by that constant.


Here is a trick that allows you to replace the factor $|x + 1|$ with
a constant. We hereby agree that we always choose our $\delta$ so that
$\delta \le 1$. If we do that, then we will always have

$$|x - 1| < \delta \le 1, \text{ i.e. } |x - 1| < 1,$$

and x will always be between 0 and 2. Therefore

$$|x^2 - 1| = |x + 1| \cdot |x - 1| < 3|x - 1|.$$


If we now want to be sure that $|x^2 - 1| < \epsilon$, then this calculation
shows that we should require $3|x - 1| < \epsilon$, i.e. $|x - 1| < \frac{1}{3}\epsilon$. So we
should choose $\delta \le \frac{1}{3}\epsilon$. We must also live up to our promise never
to choose $\delta > 1$, so if we are handed an $\epsilon$ for which $\frac{1}{3}\epsilon > 1$, then
we choose $\delta = 1$ instead of $\delta = \frac{1}{3}\epsilon$. To summarize, we are going to
choose

$$\delta = \text{the smaller of } 1 \text{ and } \tfrac{1}{3}\epsilon.$$

We have shown that if you choose $\delta$ this way, then $|x - 1| < \delta$ implies
$|x^2 - 1| < \epsilon$, no matter what $\epsilon > 0$ is.
The expression “the smaller of a and b” shows up often, and
is abbreviated to min(a,b). We could therefore say that in this
problem we will choose $\delta$ to be

$$\delta = \min(1, \tfrac{1}{3}\epsilon).$$
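
A random spot check cannot replace the proof, but it is a useful way to catch an algebra slip in the choice of $\delta$. The sketch below draws sample points with $|x - 1| < \delta$ and confirms that each one satisfies $|x^2 - 1| < \epsilon$; the function names, sample size, and seed are arbitrary choices.

```python
import random


def delta_for(eps):
    # The choice made in the text: the smaller of 1 and eps/3.
    return min(1.0, eps / 3.0)


def check(eps, trials=10000, seed=0):
    """Return True if every sampled x with |x - 1| < delta has |x^2 - 1| < eps."""
    rng = random.Random(seed)
    delta = delta_for(eps)
    for _ in range(trials):
        x = 1.0 + rng.uniform(-delta, delta)
        # Skip the rare sample that rounds onto the boundary |x - 1| = delta.
        if abs(x - 1.0) < delta and not abs(x * x - 1.0) < eps:
            return False
    return True


for eps in (10.0, 3.0, 0.5, 1e-3):
    print(f"eps = {eps}:  delta = {delta_for(eps):.6g},  all samples pass: {check(eps)}")
```

For large $\epsilon$ the cap at 1 is what determines $\delta$; for small $\epsilon$ it is the factor of one third.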
Example 18
> Show that $\lim_{x \to 4} 1/x = 1/4$.


> We apply the definition with a = 4, L = 1/4 and f(x) = 1/x.
Thus, for any $\epsilon > 0$ we try to show that if $|x - 4|$ is small enough
then one has $|f(x) - 1/4| < \epsilon$.


We begin by estimating $|f(x) - \frac{1}{4}|$ in terms of $|x - 4|$:

$$|f(x) - 1/4| = \left|\frac{1}{x} - \frac{1}{4}\right| = \left|\frac{4 - x}{4x}\right| = \frac{|x - 4|}{|4x|} = \frac{1}{|4x|}\,|x - 4|.$$


As before, things would be easier if $1/|4x|$ were a constant. To
achieve that we again agree not to take $\delta > 1$. If we always have
$\delta \le 1$, then we will always have $|x - 4| < 1$, and hence 3 < x < 5.
How large can $1/|4x|$ be in this situation? Answer: the quantity
$1/|4x|$ increases as you decrease x, so if 3 < x < 5 then it will
never be larger than $1/|4 \cdot 3| = \frac{1}{12}$.


We see that if we never choose $\delta > 1$, we will always have

$$|f(x) - \tfrac{1}{4}| \le \tfrac{1}{12}|x - 4| \text{ for } |x - 4| < \delta.$$

To guarantee that $|f(x) - \frac{1}{4}| < \epsilon$ we could therefore require

$$\tfrac{1}{12}|x - 4| < \epsilon, \text{ i.e. } |x - 4| < 12\epsilon.$$

Hence if we choose $\delta = 12\epsilon$ or any smaller number, then $|x - 4| < \delta$
implies $|f(x) - \frac{1}{4}| < \epsilon$. Of course we have to honor our agreement
never to choose $\delta > 1$, so our choice of $\delta$ is

$$\delta = \text{the smaller of } 1 \text{ and } 12\epsilon = \min(1, 12\epsilon).$$
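
It is instructive to see what goes wrong if the agreement to keep $\delta$ no larger than 1 is dropped. The sketch below measures the largest value of $|1/x - 1/4|$ over sample points within $\delta$ of 4, once with the capped choice and once with the uncapped value $12\epsilon$; the function names and the sampling scheme are assumptions made for illustration.

```python
def gap(x):
    # |f(x) - L| for f(x) = 1/x and L = 1/4.
    return abs(1.0 / x - 0.25)


def worst_gap(delta, samples=100000):
    """Largest |1/x - 1/4| seen over sample points with 0 < |x - 4| < delta."""
    worst = 0.0
    for i in range(1, samples):
        offset = delta * i / samples          # strictly between 0 and delta
        for x in (4.0 - offset, 4.0 + offset):
            if x != 0.0:                      # 1/x is undefined at x = 0
                worst = max(worst, gap(x))
    return worst


eps = 1.0
capped = min(1.0, 12 * eps)
print("eps =", eps)
print("  capped delta  =", capped, "  worst gap =", worst_gap(capped))
print("  uncapped delta =", 12 * eps, "  worst gap =", worst_gap(12 * eps))
```

With the uncapped value the interval around 4 reaches all the way past x = 0, where 1/x blows up, so the bound of 1/12 on $1/|4x|$ no longer holds; the cap is what keeps x safely between 3 and 5.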
Problems


a1 Suppose x is a big, positive number. Experiment on a
calculator to figure out whether $\sqrt{x+1} - \sqrt{x-1}$ comes out big,
normal, or tiny. Try making x bigger and bigger, and see if you
observe a trend. Based on these numerical examples, form a
conjecture about the limit of this expression as x approaches infinity.
> Solution, p. 232





a2 If we want to pump air or water through a pipe, common 
sense tells us that it will be easier to move a larger quantity more 
quickly through a fatter pipe. Quantitatively, we can define the re- 
sistance, R, which is the ratio of the pressure difference produced 
by the pump to the rate of flow. A fatter pipe will have a lower 
resistance. Two pipes can be used in parallel, for instance when you 
turn on the water both in the kitchen and in the bathroom, and in 
this situation, the two pipes let more water flow than either would 
have let flow by itself, which tells us that they act like a single pipe 
with some lower resistance. The equation for their combined
resistance is $R = 1/(1/R_1 + 1/R_2)$.
(a) Analyze the case where one resistance is fixed at some finite 
value, while the other approaches infinity. Give a physical interpre- 
tation. 
(b) Likewise, discuss the case where one is finite, but the other be- 
comes very small. 

> Solution, p. 232 


c1 Sketch the graph of the function $e^{-1/x}$, and evaluate the
following four limits:

$$\lim_{x \to 0^+} e^{-1/x}$$
$$\lim_{x \to 0^-} e^{-1/x}$$
$$\lim_{x \to +\infty} e^{-1/x}$$
$$\lim_{x \to -\infty} e^{-1/x}$$

> Solution, p. 232
c2 Compute the following limits.
(a)
$$\lim_{x \to -4} (x + 3)^{1492}$$
(b)
$$\lim_{x \to -4} (x + 3)^{1493}$$
(c)
$$\lim_{x \to \infty} (\sin x)^{1492}$$
√
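c3 Compute the following limits.
(a)
$$\lim_{u \to \infty} \frac{u^2 + 3}{u^2 + 4}$$
(b)
$$\lim_{u \to \infty} \frac{u^5 + 3}{u^2 + 4}$$
(c)
$$\lim_{u \to \infty} \frac{u^2 + 1}{u^5 + 2}$$
(d)
$$\lim_{u \to \infty} \frac{(2u + 1)^4}{(3u^2 + 1)^2}$$
√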
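c4 Do the following notations make sense?
$$\lim_{x \nearrow \infty} \qquad \lim_{x \searrow \infty} \qquad \lim_{x \nearrow -\infty} \qquad \lim_{x \searrow -\infty}$$
c5 Give two examples of functions for which $\lim_{x \searrow 0} f(x)$ does
not exist.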
c6 Find a constant k such that the function
$$f(x) = \begin{cases} 3x + 2 & \text{for } x < 2 \\ x^2 + k & \text{for } x \ge 2 \end{cases}$$
is continuous. Hint: Compute the one-sided limits. √
c7 A function f is defined by
$$f(x) = \begin{cases} x^3 & \text{for } x < -1 \\ ax + b & \text{for } -1 \le x < 1 \\ x^2 + 2 & \text{for } x \ge 1, \end{cases}$$
where a and b are constants. The function f is continuous. What
are a and b? Hint: Compute the one-sided limits. √


c8 Find a rule for determining the number of horizontal and
vertical asymptotes possessed by the following function.

$$f(x) = \frac{1}{ax^2 + bx + c}$$
> Solution, p. 233


c9 Find any horizontal and vertical asymptotes of the following 


function. 
a’ + 1234567 


Te) xi +1 


> Solution, p. 234 
c10 Let 








roe etl 224+3\ 
S\N a242 a244) 


Find any horizontal or vertical asymptotes. > Solution, p. 234


e1 The galactic empire has been pretty successful at crushing
the rebel alliance, but there are still rebels lying low, scattered
around in various solar systems. The empire offers a bounty x for
the severed head of each rebel that is brought to the Dark Lord.
Let f be the fraction of the rebels who are caught by the freelance
bounty hunters. As in the examples in section 4.4.1, sketch the
function f(x) without knowing its equation. You should be able to
infer whether or not f’(0) = 0. > Solution, p. 234


e2 A pendulum is pulled back through an angle $\theta$ and then
released. It then swings from $\theta$ to $-\theta$ and back to $\theta$ again; this is
considered one complete oscillation. The time it takes to carry out
this oscillation is called the period, T. If the pendulum is hung on
a stiff rod rather than with a string, then $\theta$ can be as big as 180°;
you will find it helpful to consider what happens in the extreme case
where $\theta$ equals 180°. As in the examples in section 4.4.1, sketch the
function $T(\theta)$ without knowing its equation. You should be able to
infer whether or not T’(0) = 0.


e3 The rod in the figure is supported by the finger and the
string. The tension T in the string depends on the distance b of the
finger from the free end of the rod. As in the examples in section
4.4.1, sketch the function T(b) without knowing its equation. The
domain of the function consists of the physically possible values of
b that allow the system to be in equilibrium. Discuss the x- and
y-intercepts.


Problem e3.


Problem g1.
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gl The top part of the figure shows the position-versus-time 

graph for an object moving in one dimension. On the bottom part 

of the figure, sketch the corresponding velocity-versus-time graph. 
> Solution, p. 235 


il Let 
1 


TONS sores 


be defined on the interval [—1, 1]. Find any local and global extrema, 
as well as any asymptotes. Sketch the graph. 


i2 Let 
—1° 
Find any local and global extrema, as well as any asymptotes. 


Sketch the graph. 


i3 Let 
g2+1 
z—1- 





f(e) = 


Find any local and global extrema, as well as any asymptotes. 
Sketch the graph. 


k1 Prove the following theorem. Let f be a real function whose
second derivative is defined and continuous. If f'' is sometimes
positive and sometimes negative, then f has a point of inflection x, and
f''(x) = 0. Note that f''(x) = 0 is not the definition of a point of
inflection, and that the theorem fails for a function on the rational
numbers. > Solution, p. 235
nl Compute the following limits. 
(a) 
_ e+t-2 
lim 
tol t?-1 
(b) 
_ e+t-2 
lim 
tt” t—1 
(c) 
_ @4+t-2 
lim 
to-1 t2-1 
Vv 
n2 Use the $\epsilon$-$\delta$ definition to prove the following limit.

$$\lim_{x \to 3} x^2 = 9$$
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Chapter 5 
More derivatives 


5.1 Transcendental numbers and functions 




5.1.1 Transcendental numbers 


Historically, the motivation for expanding the rational numbers
to form the reals came from the desire to be able to discuss numbers
like $\sqrt{2}$ or $\sqrt[3]{7}$. (The decision was not without controversy. Legend
has it that Hippasus of Metapontum, who lived in the fifth century
B.C., proved $\sqrt{2}$ to be irrational, and that the gods punished him by
causing him to drown at sea.) We’ve already seen that the completeness
property of the reals (section 4.5, p. 111) guarantees that $\sqrt{2}$
is a real number, and more generally one can use the intermediate
value theorem to prove that roots of polynomials are real.


However, there are also numbers that cannot be defined as roots 
of polynomials having rational-number coefficients. These are called 
transcendental numbers. In some sense nearly all real numbers are 
transcendental. For example, suppose we generate a random digit 
by some method such as rolling dice, and we let this be the first digit 
in a decimal. Continuing in this way, we keep on generating more 
and more decimal places. If we could continue generating the digits 
indefinitely, then there would be a 100% probability that our number 
would be transcendental. The important mathematical constants $\pi$
and e (the base of natural logarithms) are transcendental. Although
transcendental numbers are the most common kind of real number,
proving whether or not a particular number is transcendental can be
difficult. Box 5.1 describes the first number that was ever proved to
be transcendental. It was not until 38 years later, in 1882, that $\pi$
was proved to be transcendental.


An important property of transcendental numbers is that they 
can’t be written using any finite number of symbols in terms of 
rational numbers and the basic operations of arithmetic: addition, 
subtraction, multiplication, division, and roots. This is the reason 
for the name; transcendental numbers “transcend” arithmetic. For 


example, the number 
_—“ VE 2 ANOS Se 


is not transcendental, since it is written in terms of rational num- 
bers and four of the basic operations. (It is a root of the polyno- 


>Box 5.1 A transcendental number

The first number proved to be transcendental, by Liouville in 1844, was:

0.110001000000000000000001 ...

The first one occurs in the 1st decimal place, the next in the 2nd
decimal place, the next in the 6th, and so on, with the sequence of
numbers being 1, 1·2 = 2, 1·2·3 = 6, ... Without going into the formal
proof, it’s not hard to get an intuitive feel for why this number is
transcendental. Since the list of numbers 1, 2, 6, ... grows extremely
rapidly, we find that as we continue to write the decimal expansion, it
gets extremely sparse. It’s so sparse that if we try to cook up a
polynomial such as $P(x) = x^2 + 9x - 1$ with Liouville’s number x as a
root, we are bound to fail; $x^2$ and all higher powers of x are also
extremely sparse, and this makes it impossible to get them to cancel out
and give P(x) = 0. For a proof, see the Wikipedia article “Liouville
number.”
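The digit pattern described in box 5.1 is easy to generate mechanically. The following is a minimal sketch, not part of the original text; the function name `liouville_digits` is made up for illustration. It places a 1 in every decimal place whose position is a factorial and a 0 everywhere else.

```python
from math import factorial


def liouville_digits(num_places: int) -> str:
    """Return the first num_places decimal digits of Liouville's number."""
    # The 1-indexed decimal places holding a 1 are 1!, 2!, 3!, ...
    ones = set()
    k = 1
    while factorial(k) <= num_places:
        ones.add(factorial(k))
        k += 1
    return "".join(
        "1" if place in ones else "0" for place in range(1, num_places + 1)
    )


digits = liouville_digits(24)
print("0." + digits)
```

Asking for more places shows how quickly the ones thin out: the fifth one does not appear until the 120th decimal place.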
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>Box 5.2 A different defi- 
nition of e 


Some people like lagers better than ales, Chicago better than Paris,
and the following better than equation (2) as a definition of e:

$$e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n \qquad (1)$$

The story-line behind (1) is something like this. Suppose your bank
account carries an interest rate of 100%; the second 1 in the equation
is 100/100. If the interest is compounded yearly, then your balance
goes up every year by a factor of $(1 + 1)^1 = 2$. If it’s compounded
monthly at an interest rate of 100%/12, then the yearly increase is a
factor of $(1 + 1/12)^{12} \approx 2.6$. If we let the 12 become a variable n
that approaches infinity, then the 2.6 becomes e.

Let’s connect this to equation (2). Applying the approximation
$dy/dx \approx \Delta y/\Delta x$ to $y = e^x$, we have

$$e^x \approx 1 + x$$

for small values of x. Let x = 1/n, where n is large. Then
$e^{1/n} \approx 1 + 1/n$, so $e \approx (1 + 1/n)^n$, which is consistent with
equation (1).


mial P given in box 5.1.) The converse is not true: not all
non-transcendental numbers can be written using these operations. For
example, the polynomial $x^5 - x + 1$ has a root $x \approx -1.17$, which
cannot be expressed in terms of arithmetic.
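Even though this root has no expression in terms of arithmetic and radicals, it can still be pinned down numerically to any desired accuracy. The sketch below is an illustration rather than anything from the text: it uses bisection (the intermediate value theorem in action) on the interval from -2 to -1, where the polynomial changes sign. The helper names `p` and `bisect` are hypothetical.

```python
def p(x: float) -> float:
    """The quintic x^5 - x + 1 from the text."""
    return x**5 - x + 1


def bisect(f, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Find a root of f between lo and hi, assuming f changes sign there."""
    assert f(lo) * f(hi) < 0, "need a sign change on [lo, hi]"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid  # the sign change is in the left half
        else:
            lo = mid  # the sign change is in the right half
    return (lo + hi) / 2


root = bisect(p, -2.0, -1.0)
print(root)
```

The printed value agrees with the figure of about -1.17 quoted above.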


5.1.2 Transcendental functions 


Similarly, we have functions that are transcendental or not tran- 
scendental. For example, the function 


_ Oe 


fa) = 


is not transcendental because it can be written using the same basic 
operations of arithmetic. The techniques developed in chapter 2 are 
sufficient to differentiate any function that is not transcendental. 
The purpose of the present chapter is to see how to differentiate 
some functions that are transcendental. 


Since the numbers $\pi$ and e are transcendental, it is not surprising
that the following closely related functions are transcendental:

$$\sin x$$

$$\cos x$$

$$e^x$$

$$\ln x$$


Although the distinction between transcendental and non-transcend- 
ental numbers is of little practical significance (e.g., no real-world 
measurement will tell us whether a stick’s length is transcendental 
or not), the distinction becomes an important one when we come 
to functions, because the methods we know so far will not suffice 
to differentiate a transcendental function. Most of this chapter will 
be concerned with how to extend our methods of differentiation to 
cover these functions. 


5.2 Derivatives of exponentials 


In example 3 on p. 19 and example 6 on p. 51 we found that the
derivative of an exponential is an exponential: the more bunnies you
have, the faster you produce baby bunnies; the more credit-card debt
you have, the faster your debt grows. Furthermore, we were led to
the conjecture that in the case of “the” exponential function $e^x$, the
constant of proportionality between the function and its derivative
was simply one:

$$(e^x)' = e^x \qquad (2)$$


There is no way to prove this unless we adopt some definition of e. 
In fact equation (2) serves as a perfectly good definition of e. Box 
5.2 connects this to another popular definition. 
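The compound-interest limit in box 5.2 can be watched converging numerically. This is only an illustrative sketch; the function name `compound` is made up here, and the sample values of n are arbitrary.

```python
import math


def compound(n: int) -> float:
    """Yearly growth factor at 100% interest compounded n times per year."""
    return (1 + 1 / n) ** n


for n in (1, 12, 365, 10**6):
    # The gap between (1 + 1/n)^n and e shrinks as n grows.
    print(n, compound(n), math.e - compound(n))
```

Yearly compounding gives 2, monthly gives about 2.6, and the factor creeps up toward e as n increases, as the box describes.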
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Adopting equation (2) as a definition, application of the identity 
$b^x = e^{(\ln b)x}$ (see equation (9), p. 134) and the chain rule gives the
more general rule

$$(b^x)' = (\ln b)\,b^x \qquad (3)$$

for any base b.
Caffeine Example 1
> The concentration of a foreign substance in the bloodstream
generally falls off exponentially with time as $c = c_0 e^{-t/a}$, where
$c_0$ is the initial concentration, and a is a constant. For caffeine
in adults, a is typically about 7 hours. An example is shown in
figure a. Differentiate the concentration with respect to time, and
interpret the result. Check that the units of the result make sense.

a / A typical graph of the concentration of caffeine in the blood, in
units of milligrams per liter, as a function of time, in hours.

> Using the chain rule,

$$\frac{dc}{dt} = c_0 e^{-t/a} \cdot \left(-\frac{1}{a}\right) = -\frac{c_0}{a}\, e^{-t/a}.$$

This can be interpreted as the rate at which caffeine is being removed
from the blood and broken down by the liver. It’s negative
because the concentration is decreasing. According to the original
expression for c, a substance with a large a will take a long
time to reduce its concentration, since t/a won’t be very big unless
we have large t on top to compensate for the large a on
the bottom. In other words, larger values of a represent substances
that the body has a harder time getting rid of efficiently.
The derivative has a on the bottom, and the interpretation of this
is that for a drug that is hard to eliminate, the rate at which it is
removed from the blood is low.

It makes sense that a has units of time, because the exponential
function has to have a unitless argument, so the units of t/a
have to cancel out. The units of the result come from the factor
of $c_0/a$, and it makes sense that the units are concentration divided
by time, because the result represents the rate at which the
concentration is changing.
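As a sanity check on the caffeine example, the chain-rule result can be compared with a finite-difference estimate of the slope. This is a sketch under stated assumptions: a = 7 hours comes from the example, but the initial concentration of 2 mg/L and all function names are illustrative choices, not values given in the text.

```python
import math

A_HOURS = 7.0  # time constant a, "about 7 hours" in the example
C0 = 2.0       # initial concentration in mg/L (assumed for illustration)


def conc(t: float) -> float:
    """Concentration c = c0 * exp(-t/a), in mg/L, with t in hours."""
    return C0 * math.exp(-t / A_HOURS)


def dconc_dt(t: float) -> float:
    """Chain-rule derivative dc/dt = -(c0/a) * exp(-t/a), in mg/L per hour."""
    return -(C0 / A_HOURS) * math.exp(-t / A_HOURS)


def central_diff(f, t: float, h: float = 1e-5) -> float:
    """Numerical slope of f at t from a symmetric difference quotient."""
    return (f(t + h) - f(t - h)) / (2 * h)


for t in (0.0, 6.0, 12.0, 24.0):
    print(t, dconc_dt(t), central_diff(conc, t))
```

Both columns are negative and shrink in magnitude over time, matching the interpretation that the removal rate falls as the concentration falls.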


A base-10 exponential Example 2 
> Find the derivative of the function y = 10”, verifying equation 
(3) directly in the case b = 10. 


> In general, one of the tricks to doing calculus is to rewrite func- 
tions in forms that you know how to handle. This one can be 
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72) 


b / The radian measure of the angle $\theta$ is s/r.

c / The sine of $\theta$ is y/r, the cosine x/r.

d / The sine and cosine defined on the unit circle, for any
angle $\theta$.


rewritten as a base-e exponent:

$$y = 10^x$$
$$\ln y = \ln\left(10^x\right)$$
$$\ln y = x \ln 10$$
$$y = e^{x \ln 10}$$

Applying the chain rule, we have the derivative of the exponential,
which is just the same exponential, multiplied by the derivative of
the inside stuff:

$$\frac{dy}{dx} = e^{x \ln 10} \cdot \ln 10 = (\ln 10)\,10^x$$
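Equation (3) can also be checked numerically for base 10 and a couple of other bases. The sketch below is illustrative only; the helper names and the choice of test points are assumptions made here.

```python
import math


def exp_base_derivative(b: float, x: float) -> float:
    """Derivative of b**x according to equation (3): (ln b) * b**x."""
    return math.log(b) * b**x


def central_diff(f, x: float, h: float = 1e-6) -> float:
    """Numerical slope of f at x from a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


for b in (2.0, math.e, 10.0):
    numeric = central_diff(lambda x: b**x, 1.0)
    print(b, exp_base_derivative(b, 1.0), numeric)
```

For b = e the factor ln b equals 1, which recovers equation (2) as a special case.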





5.3 Review: the trigonometric functions


Before we talk about how to differentiate trig functions, here’s an 
opportunity to refresh your memory on what trig functions are in 
the first place. 


5.3.1 Radian measure 


The presence of numbers like 60 and 360 in our units of mea- 
surement for time and angles dates back to the ancient Babylonians. 
The reason for splitting larger quantities into these numbers of sub- 
divisions is that 60 and 360 are divisible by many small integers, 
including 2, 3, 5, 10, and 12. For practical purposes it’s fine for 
a carpenter to define a right angle as 90°. But it turns out to be 
much less cumbersome when doing calculus to adopt the radian as
our unit of angle, as defined in figure b. A right angle is $\pi/2$ radians,
a full circle $2\pi$. From the definition we observe that a number with
“units” of radians is in fact the unitless ratio of two distances.


5.3.2 Sine and cosine 


Figure c shows a right triangle. The sine and cosine of the angle
$\theta$ are defined as the ratios

$$\sin\theta = \frac{y}{r} \qquad \text{and} \qquad \cos\theta = \frac{x}{r}.$$

Since these ratios are the same for any two similar triangles, the
definitions depend only on $\theta$, not on the triangle.





5.3.3 Arbitrary angles 


Since the above definition assumes a right triangle, it is restricted
to angles $\theta$ that are between 0 and $\pi/2$ (a right angle). Figure d
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shows how to generalize this to an angle that is an arbitrary real 
number. The circle is the unit circle, i.e., the circle centered on the 
origin and having radius 1. The angle is by convention measured 
counterclockwise from the x axis; a negative angle would indicate 
a clockwise rotation. The (x, y) coordinates of a point on the unit
circle at angle $\theta$ are $(\cos\theta, \sin\theta)$.


It is handy to know these facts: 


cos0 = 1 
sin0 = 0 


These do not need to be memorized. They can be recovered instantly 
by visualizing the unit circle. 


The following identities will be needed later in the chapter. 


$$\sin(x + y) = \sin x \cos y + \cos x \sin y \qquad (4a)$$

$$\cos(x + y) = \cos x \cos y - \sin x \sin y \qquad (4b)$$


5.3.4 Other trigonometric functions 


In terms of the same variables defined above, we have the fol- 
lowing additional trigonometric functions: 


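$$\tan\theta = \frac{y}{x} \qquad \text{[important]}$$
$$\csc\theta = 1/\sin\theta \qquad \text{[not as important]}$$
$$\sec\theta = 1/\cos\theta \qquad \text{[not as important]}$$
$$\cot\theta = 1/\tan\theta \qquad \text{[not as important]}$$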


5.4 Derivatives of trigonometric functions 


5.4.1 Derivatives of the sine and cosine 


Sometimes a variable oscillates back and forth. A weight hung 
from a rubber band will vibrate up and down. The temperature of 
Los Angeles goes down every winter and back up every summer. A 
sinusoidal wave is the most mathematically simple model of such an 
oscillation, and if we want to know the rate of change, we need to 
know how to differentiate such a function. 


So how would we find the derivative of a sine or cosine? Since 
they’re transcendental, they can’t be expressed in terms of simpler 
functions that we know how to differentiate. 

Derivatives at $\theta = 0$


Let’s start by finding the derivatives of these functions at zero, 
as shown in figure e. 


Since the cosine is an even function, we have cos’ 0 = 0. 


e / The derivatives of the cosine and sine functions at $\theta = 0$.

f / A geometrical method of finding $\sin' 0$.
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g / Sketching 


the derivative 


of the sine function. 
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What about sin’ 0? The definition of the derivative gives 


$$\sin' 0 = \lim_{\theta \to 0} \frac{\sin\theta - \sin 0}{\theta - 0} = \lim_{\theta \to 0} \frac{\sin\theta}{\theta}.$$

In figure f, the definition of radian measure gives $\theta = s$, while the
definition of the sine function tells us that $\sin\theta = y$. Thus the limit
above becomes

$$\sin' 0 = \lim_{\theta \to 0} \frac{y}{s}.$$

If $\theta$ is close to zero, then the lengths y of the vertical line and s
of the arc should be nearly the same, so we have the small-angle
approximation $\sin\theta \approx \theta$. Our limit is clearly¹ equal to 1, so we have
$\sin' 0 = 1$.


As a check on our work, we can take a numerical approximation
to the derivative at $\theta = 0$,

$$\sin' 0 \approx \frac{\sin 0.001 - \sin 0}{0.001} = 0.99999983, \qquad \text{[angle in radians]}$$


which is indeed close to 1. 


A preliminary sketch 


What about the value of $\sin'$ at $\theta \neq 0$? Let’s sketch the derivative
of $\sin\theta$ in order to gain some insight. Using the techniques of section
4.4.2, p. 108, we obtain figure g. At $\theta = 0$, the slope of the sine
function is 1, which is as large and positive as it ever gets, so the
value of the derivative sketched in the bottom graph is large and
positive. At $\pi/2$ (90 degrees), the sine has its maximum value of
1, and its derivative is 0. At $\pi$, the sine has its largest negative
derivative. The graph we’re led to draw for $\sin'\theta$ looks like the
cosine function.


The graph of the cosine function is the same as the graph of the 
sine function except for a shift to the left by a quarter of a cycle. 
Therefore by the shift property of the derivative (p. 16), if the deriva- 
tive of sin is cos, then the derivative of cos must be a cosine function 
shifted to the left by another quarter-cycle, which gives — sin. Curve 
sketching therefore leads us to the following conjectures: 


$$\sin' = \cos$$

$$\cos' = -\sin$$





¹Strictly speaking, we should prove that for the approximation $\sin\theta \approx \theta$, the
error $E = \theta - \sin\theta$ goes to zero fast enough so that $\lim_{\theta \to 0} E/\theta = 0$. In fact,
one can show based on the areas in figure f that $|E| < |\theta^2|$ for $|\theta| < 0.1$.
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Proof of the derivatives of the sine and cosine 


To prove this, let’s apply the definition of the derivative to the
sine function.

$$\sin' x = \lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h}$$

Making use of the identity $\sin(x + y) = \sin x \cos y + \cos x \sin y$
(p. 129), we find

$$\sin' x = \lim_{h \to 0} \frac{\sin x \cos h + \cos x \sin h - \sin x}{h}
= \cos x \lim_{h \to 0} \frac{\sin h}{h} + \sin x \lim_{h \to 0} \frac{\cos h - 1}{h}.$$

We have already determined these two limits: they are the derivatives
$\sin' 0 = 1$ and $\cos' 0 = 0$ found above, so $\sin' x = \cos x$ as claimed.
The similar calculation for the derivative of cos x is left as an exercise.
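The two limits that the proof relies on can be probed numerically by shrinking h. The following is a rough sketch for illustration; the function names are invented here, and a numerical table like this is evidence, not a proof.

```python
import math


def sin_over_h(h: float) -> float:
    """The quotient sin(h)/h, which should approach 1 as h -> 0."""
    return math.sin(h) / h


def cos_minus_one_over_h(h: float) -> float:
    """The quotient (cos(h) - 1)/h, which should approach 0 as h -> 0."""
    return (math.cos(h) - 1) / h


def sine_difference_quotient(x: float, h: float) -> float:
    """The quotient from the definition of the derivative of sin at x."""
    return (math.sin(x + h) - math.sin(x)) / h


for h in (0.1, 0.01, 0.001, 0.0001):
    print(h, sin_over_h(h), cos_minus_one_over_h(h))

# With a small h, the difference quotient at x = 1 is close to cos(1).
print(sine_difference_quotient(1.0, 1e-6), math.cos(1.0))
```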


5.5 Review: the inverse of a function 


Some operations can be undone. Others can’t. Computer software 
often has an “undo” function. But what if the operation is mixing 
hot coffee with cold milk? There is no way to undo this operation, 
even in principle, because information has been lost. No matter how 
closely we inspect the mixture, we have no way of determining how 
hot the original coffee was, or how cold the original milk. 


We’ve defined a function as a graph that passes the vertical line 
test, so that every input x corresponds to a single output y. A 
function may or may not be undoable. If every y corresponds to 
a single x, i.e., if the function passes a horizontal line test, then 
it’s undoable, and we call the “undo” operation the inverse of the 
function. The inverse of a function f is notated $f^{-1}$, where only
context tells us that we mean the “undoing” of f, rather than 1/f.






















h/Some functions and their inverses. In each case, the inverse function is found by reflecting the 
graph across the line y = x. 


Geometrically, inverting the function means interchanging the
roles of x and y, which requires flipping it across the 45-degree
diagonal defined by the line y = x, as in figure h. For example,
figure h/1 shows the “add-one” function defined by f(x) = x + 1,
and the “subtract-one” function $f^{-1}(x) = x - 1$ that undoes it.


We define a function as a graph that passes the vertical-line test.
The set of all x values for which the graph contains an (x, y) point
is called the domain of the function, while the set of such y values
is its range. That is, the domain is the set of all legal inputs, while
the range is the set of possible outputs. Sometimes we define a
particular function using a formula, and this may implicitly restrict
its domain. For example, if we define

$$y = \frac{1}{x - 1},$$

then by implication the domain is the whole real line except for
x = 1, which would produce division by zero.


Sometimes there are real-world reasons for restricting the domain
of a function. For example, in section 4.3.3, p. 103, we discussed
the amount of tension T in a telephone wire that was necessary
in order to make it sag by a height h at the middle. This
function was of the form T = k/h, where k is a constant.
Mathematically this function is well defined for h < 0, but physically
that would be meaningless, since a cable can only sustain tension
(T > 0); only a rigid object such as a rod can sustain compression
(T < 0).


Sometimes by restricting the domain of a function we can make
it invertible. For example, the function $y = x^2$ fails the
horizontal-line test, so it doesn’t have an inverse function. But if we restrict
its domain to $x \ge 0$, as in figure h/4, then we can define its inverse
function, which is $x = \sqrt{y}$ (using the positive root).
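The effect of the domain restriction shows up clearly if we try to "undo" the squaring function on inputs of both signs. This is a small illustrative sketch; the names `square` and `inverse_square` are hypothetical.

```python
import math


def square(x: float) -> float:
    """The function y = x^2."""
    return x * x


def inverse_square(y: float) -> float:
    """Inverse of y = x^2 on the restricted domain x >= 0 (positive root)."""
    return math.sqrt(y)


for x in (3.0, 0.5, -3.0):
    # On x >= 0 the round trip returns x; for negative x it returns |x|.
    print(x, inverse_square(square(x)))
```

For x = -3 the round trip gives 3 rather than -3, because the information about the sign was lost when we squared.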


In terms of the composition of functions (section 2.4.3, p. 56),
the function $f \circ f^{-1}$ is simply the identity function y = x (perhaps
with a restriction on its domain and range). The same applies to
$f^{-1} \circ f$.
Discussion question 
A Which of the following four statements are true, and which are false?

1. For all real numbers x, $\sin(\sin^{-1} x) = x$.

2. For all real numbers x, $\sin^{-1}(\sin x) = x$.

3. For all real numbers x, $\tan(\tan^{-1} x) = x$.

4. For all real numbers x, $\tan^{-1}(\tan x) = x$.


5.6 Derivative of the inverse of a function 


Suppose that x is how many gallons of gas I buy, and y is how 
much money I pay. Then y is a function of x, and the rate at which 
this function changes, i.e., the price per gallon of gas, in my area is 
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currently about 
cas ey 
Az gallon’ 





It’s valid to measure this rate of change with an expression of the 
form A.../A..., because the rate of change is constant. I might 
also want to know how much gas I can get for each additional dollar 
I’m willing to spend, and this is found by ordinary algebra to be 





If y is a function of x, and the function is invertible, then the 
Leibniz notation suggests that this should hold even for non-constant 
rates of change, i.e., that the derivative of the inverse function is 


$$\frac{dx}{dy} = \frac{1}{\left(\frac{dy}{dx}\right)}.$$

This is in fact correct, with the caveat that when dy/dx = 0, dx/dy
is undefined because it blows up to infinity.


Derivative of a cube root Example 3
> Let $y = x^3$. Find dx/dy.

> The function $y = x^3$, figure i, has a well-defined inverse $x = y^{1/3}$,
which is the cube root, figure j. The derivative of the original
function is

$$\frac{dy}{dx} = 3x^2.$$

The derivative of the inverse function is

$$\frac{dx}{dy} = \frac{1}{\left(\frac{dy}{dx}\right)} = \frac{1}{3x^2} = \frac{1}{3}\, x^{-2}.$$

If we prefer to express this in terms of y, we can substitute to get

$$\frac{dx}{dy} = \frac{1}{3}\, y^{-2/3},$$

which agrees with the power rule (section 2.6, p. 57).
This expression holds everywhere except x = 0, y = 0, where
dx/dy blows up to infinity.

i / The function $y = x^3$.

j / The function $x = y^{1/3}$.
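The inverse-derivative rule in this example can be tested against a direct numerical slope of the cube-root function. The code below is an illustrative sketch; the function names and sample points are assumptions made here, and the sign-aware cube root is written out by hand so that negative inputs work.

```python
import math


def cube_root(y: float) -> float:
    """Real cube root x = y^(1/3), valid for negative y as well."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)


def dx_dy_inverse_rule(x: float) -> float:
    """dx/dy = 1/(dy/dx) with dy/dx = 3x^2, expressed in terms of x."""
    return 1.0 / (3.0 * x**2)


def central_diff(f, y: float, h: float = 1e-6) -> float:
    """Numerical slope of f at y from a symmetric difference quotient."""
    return (f(y + h) - f(y - h)) / (2 * h)


for x in (2.0, 0.5, -2.0):
    y = x**3
    print(x, dx_dy_inverse_rule(x), central_diff(cube_root, y))
```

The point x = 0 is deliberately left out, since that is where dx/dy blows up.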
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5.7 Review: logarithms 
5.7.1 Logarithms 


The inverse of exponentiation is the logarithm. If

$$b^p = z,$$

then

$$\log_b z = p.$$

For example, $\log_2 8 = 3$, because $2^3 = 8$.

The number 10 has appeared above as a base, and that’s because
humans have 10 fingers. There’s clearly nothing all that special
about 10. It’s an accident of evolution. A number with more cosmic
significance is $e \approx 2.71828\ldots$ Exponents and logarithms with base
e have some nice properties, which we’ll discuss later in more detail.
Any expression with x in the exponent is called an exponential, but
$e^x$ is “the” exponential function. Sometimes when x is a complicated
expression it gets awkward to write it as a superscript, and then we
write exp(...) instead of $e^{\ldots}$. The logarithm with the special base e
is called the natural logarithm, notated ln.


5.7.2 Identities 


The following identities are useful. Exponentials and logs are
inverse operations:

$$\log_b\left(b^x\right) = x \qquad (5a)$$
$$b^{\log_b x} = x \qquad (5b)$$

Logs turn multiplication and division into addition and subtraction:

$$\log(xy) = \log x + \log y \qquad (6a)$$
$$\log(x/y) = \log x - \log y \qquad (6b)$$

A log in one base can be changed into a log in another base:

$$\log_b x = \frac{\log_c x}{\log_c b} \qquad (7)$$

For example, $\log_{10} 10^6 = 6$, whereas $\log_{100} 10^6 = 3$. It may be
convenient to convert a logarithm to a natural log, with c = e:

$$\log_b x = \frac{\ln x}{\ln b} \qquad (8)$$

Similarly, an exponential with an arbitrary base b can be converted
to an exponential with base e.

$$b^x = e^{(\ln b)x} \qquad (9)$$
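These identities are easy to spot-check with floating-point arithmetic. The sketch below is for illustration only; `log_base` is a hypothetical helper that implements equation (8), and the test values are arbitrary.

```python
import math


def log_base(b: float, x: float) -> float:
    """Logarithm of x in base b, via the natural log as in equation (8)."""
    return math.log(x) / math.log(b)


# Equation (7)/(8): the same number has different logs in different bases.
print(log_base(10, 10**6), log_base(100, 10**6))

# Equations (5a) and (5b): exponentials and logs undo each other.
b, x = 2.0, 5.0
print(log_base(b, b**x), b ** log_base(b, x))

# Equation (6a): the log of a product is the sum of the logs.
print(math.log(6.0), math.log(2.0) + math.log(3.0))

# Equation (9): an exponential in base b rewritten in base e.
print(b**x, math.exp(math.log(b) * x))
```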
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The derivative of a logarithm 


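We now know enough to differentiate a logarithm. The natural log
has the nicest properties, so we’ll start with it. Let

$$y = \ln x.$$

Then

$$\frac{dy}{dx} = \frac{1}{\left(\frac{dx}{dy}\right)} \qquad \text{[derivative of an inverse]}$$
$$= \frac{1}{\left(\frac{d(e^y)}{dy}\right)} \qquad [x = e^y]$$
$$= \frac{1}{e^y} \qquad \text{[derivative of the exponential is the exponential]}$$
$$= \frac{1}{x} \qquad [x = e^y \text{ again}]$$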


The result is unexpectedly simple. 


Derivative of the natural logarithm 


$$\frac{d \ln x}{dx} = \frac{1}{x}$$


This is noteworthy because it shows that there must be an exception
to the rule that we can always obtain a function that varies
like $x^{n-1}$ by differentiating something like $x^n$. If we believed that
this rule was always true, then we would think that we could obtain
the function $x^{-1}$ by differentiating some function of the form
(constant)$x^0$. But in fact this doesn’t work, since $x^0$ is a constant,
and the derivative of $x^0$ is therefore 0. Figure k shows the idea.


Derivatives of logs with other bases can be found by using equa- 
tion (8) to convert to a natural log. The result is 





$$\frac{d \log_b x}{dx} = \frac{1}{(\ln b)\,x}$$
The power rule for irrational exponents Example 4 


In section 2.6, p. 57, we showed that the power rule $(x^n)' = nx^{n-1}$
held for any nonzero integer value of n, and also gave a sample
of a proof for a fractional exponent. However, the methods used
there were not capable of proving the result for irrational values
of n, or of demonstrating it for all rational values in a single proof.
We now have the ability to carry out the proof in an efficient way
for any real, nonzero n.


k / A “ladder” of powers of x. Ignoring multiplicative constants,
differentiation usually just takes us one step down the ladder.
The diagram shows the two exceptions.

Let

$$y = x^n = e^{n \ln x}.$$

By the chain rule,

$$\frac{dy}{dx} = e^{n \ln x} \cdot \frac{n}{x} = x^n \cdot \frac{n}{x} = nx^{n-1}.$$

(For n = 0, the result is zero.)

l / The sine and inverse sine functions.
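Since the point of this example is that the power rule holds even for irrational exponents, it is natural to try it numerically with exponents like the square root of 2 or pi. This is only an illustrative sketch with made-up helper names, restricted to x > 0 so that the real power is defined.

```python
import math


def power_rule(n: float, x: float) -> float:
    """Derivative of x**n predicted by the power rule: n * x**(n - 1)."""
    return n * x ** (n - 1)


def central_diff(f, x: float, h: float = 1e-6) -> float:
    """Numerical slope of f at x from a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


for n in (math.sqrt(2), math.pi, -math.e):
    numeric = central_diff(lambda x: x**n, 2.0)
    print(n, power_rule(n, 2.0), numeric)
```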


5.9 Derivatives of inverse trigonometric 
functions 


The sine and cosine functions are not invertible, since they fail the
horizontal line test. In fact, any horizontal line that crosses these
functions crosses them in infinitely many places. For example, if I
tell you that I took the sine of some angle, and the sine was zero,
then the angle could have been any number from the infinite set
$\{\ldots, -2\pi, -\pi, 0, \pi, 2\pi, \ldots\}$. But by restricting the domain of the
sine function appropriately, e.g., to $-\pi/2 \le x \le \pi/2$, we can make
an invertible function and define an inverse sine, figure l.


The derivative of the inverse sine can be found straightforwardly
by using our knowledge of the derivatives of inverses of functions.
Let $y = \sin^{-1} x$. Then:

$$\frac{dy}{dx} = \frac{1}{\left(\frac{dx}{dy}\right)}$$
$$= \frac{1}{\cos y} \qquad \text{[because } x = \sin y\text{]}$$
$$= \frac{1}{\sqrt{1 - x^2}} \qquad \text{[because } (\cos y, \sin y) \text{ lies on the unit circle]}$$

A similar calculation shows that the derivative of $\cos^{-1} x$ is $-1/\sqrt{1 - x^2}$.
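The two steps of this derivation can be followed numerically: first apply the inverse rule literally as 1/cos y, then compare with the closed form in terms of x. The sketch below is illustrative; the function names are invented here.

```python
import math


def asin_slope_inverse_rule(x: float) -> float:
    """Slope of the inverse sine from dy/dx = 1/(dx/dy) = 1/cos(y)."""
    y = math.asin(x)  # y lies in [-pi/2, pi/2], where cos(y) >= 0
    return 1.0 / math.cos(y)


def asin_slope_closed_form(x: float) -> float:
    """Slope of the inverse sine written in terms of x: 1/sqrt(1 - x^2)."""
    return 1.0 / math.sqrt(1.0 - x * x)


for x in (-0.9, 0.0, 0.5):
    print(x, asin_slope_inverse_rule(x), asin_slope_closed_form(x))
```

The positive square root is the right choice because the restricted domain keeps cos y from going negative.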
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5.10 Summary of derivatives of 
transcendental functions 


Given the derivatives of trig and inverse trig functions from sections 
5.4 and 5.9, it is straightforward to extend the list of derivatives to 
include the other familiar trig functions. In this section we provide 
a summary for reference purposes of all of the derivatives of the 
transcendental functions encountered so far. 


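$$(e^x)' = e^x \qquad\qquad (\ln x)' = \frac{1}{x}$$
$$(\sin x)' = \cos x \qquad\qquad (\sin^{-1} x)' = (1 - x^2)^{-1/2}$$
$$(\cos x)' = -\sin x \qquad\qquad (\cos^{-1} x)' = -(1 - x^2)^{-1/2}$$
$$(\tan x)' = (\cos x)^{-2} \qquad\qquad (\tan^{-1} x)' = (1 + x^2)^{-1}$$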
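A table like this is a good candidate for a mechanical spot check. The sketch below compares each listed derivative with a finite-difference slope at a single test point; the table layout, the names, and the test point x = 0.3 (chosen to lie inside every function's domain) are assumptions for illustration.

```python
import math


def central_diff(f, x: float, h: float = 1e-6) -> float:
    """Numerical slope of f at x from a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Each entry: (label, function, derivative claimed in the summary table).
TABLE = [
    ("exp", math.exp, math.exp),
    ("ln", math.log, lambda x: 1 / x),
    ("sin", math.sin, math.cos),
    ("cos", math.cos, lambda x: -math.sin(x)),
    ("tan", math.tan, lambda x: math.cos(x) ** -2),
    ("asin", math.asin, lambda x: (1 - x**2) ** -0.5),
    ("acos", math.acos, lambda x: -((1 - x**2) ** -0.5)),
    ("atan", math.atan, lambda x: (1 + x**2) ** -1),
]

X0 = 0.3  # test point inside the domain of every function above


def max_table_error(x: float = X0) -> float:
    """Largest gap between a claimed derivative and its numerical estimate."""
    return max(abs(central_diff(f, x) - fprime(x)) for _, f, fprime in TABLE)


for label, f, fprime in TABLE:
    print(label, fprime(X0), central_diff(f, X0))
```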


5.11 Hyperbolic functions 


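The hyperbolic trig functions are defined as follows.

$$\sinh x = \frac{1}{2}\left(e^x - e^{-x}\right)$$

$$\cosh x = \frac{1}{2}\left(e^x + e^{-x}\right) \qquad \text{and}$$

$$\tanh x = \frac{\sinh x}{\cosh x}.$$

Their inverses can be calculated using the following relations:

$$\sinh^{-1} x = \ln\left(x + \sqrt{x^2 + 1}\right)$$
$$\cosh^{-1} x = \ln\left(x + \sqrt{x^2 - 1}\right)$$
$$\tanh^{-1} x = \frac{1}{2}\ln\left(\frac{1 + x}{1 - x}\right)$$

The derivatives are as follows:

$$(\sinh x)' = \cosh x \qquad\qquad (\sinh^{-1} x)' = (x^2 + 1)^{-1/2}$$
$$(\cosh x)' = \sinh x \qquad\qquad (\cosh^{-1} x)' = (x^2 - 1)^{-1/2}$$
$$(\tanh x)' = (\cosh x)^{-2} \qquad\qquad (\tanh^{-1} x)' = (1 - x^2)^{-1}$$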
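The logarithmic formulas for the inverse hyperbolic functions can be compared with the built-in versions in Python's standard `math` module. This is an illustrative sketch; the `*_log` function names are made up here, and the test points are arbitrary values inside each function's domain.

```python
import math


def asinh_log(x: float) -> float:
    """Inverse hyperbolic sine via ln(x + sqrt(x^2 + 1))."""
    return math.log(x + math.sqrt(x * x + 1))


def acosh_log(x: float) -> float:
    """Inverse hyperbolic cosine via ln(x + sqrt(x^2 - 1)), for x >= 1."""
    return math.log(x + math.sqrt(x * x - 1))


def atanh_log(x: float) -> float:
    """Inverse hyperbolic tangent via (1/2) ln((1 + x)/(1 - x)), for |x| < 1."""
    return 0.5 * math.log((1 + x) / (1 - x))


print(asinh_log(0.7), math.asinh(0.7))
print(acosh_log(1.5), math.acosh(1.5))
print(atanh_log(0.4), math.atanh(0.4))
```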
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Review problems 


a1 For what set of angles $\theta$ do we have both $\sin\theta < 0$ and
$\cos\theta < 0$? > Solution, p. 235


a2 Let the function f be defined by f(x) = 2° +1. Find an 
expression for the function f~!. Vv 


a3 Evaluate log; \/1/27. v 


Problem b1 does not require any of the new calculus learned in this 
chapter, but does require knowledge of the transcendental functions 
reviewed in it. 


b1 Find the following limits at infinity. Check your results by 
plugging in large numbers on a calculator or by graphing. 
(a) 
sin x 
im ———— 

a—oo sin(x + 7) 
(b) 
Vxi+1 coszx 





lim 
@w— 00 at+3 
(c) 
. Ina 
lim — 
To 
(a) 7 
lim 
x00 COS Z 
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Problems 


c1 Differentiate $\ln(2t + 1)$ with respect to t.
> Solution, p. 235 


c2 Differentiate $a\sin(bx + c)$ with respect to x.
> Solution, p. 235 


c3 Differentiate the following with respect to x: e”, e®. (In the 

latter expression, as in all exponentials nested inside exponentials, 

the evaluation proceeds from the top down, i.e., e"), not (e%)*.) 
> Solution, p. 235 


c4 The range of a gun, when elevated to an angle $\theta$, is given by

$$R = \frac{2v^2}{g}\,\sin\theta\cos\theta.$$


Find the angle that will produce the maximum range. 
> Solution, p. 236 


c5 Prove, as claimed on p. 137, that the derivative of $\tan\theta$ with
respect to $\theta$ is $(\cos\theta)^{-2}$. Assume that the derivatives of the sine and
cosine are already known. > Solution, p. 236


c6 Show that the function sin(sin(sin x)) has maxima and min-

ima at all the same places where sin x does, and at no other places. 
> Solution, p. 236 

c7 Find any extrema of the hyperbolic cosine function defined 


on p. 137. > Solution, p. 237 


d1 (a) Let $y = \ln(1 + x)$. Find the best linear approximation to
this function near x = 0. √
(b) Use the result of part a to approximate the value of ln(1.003)
without a calculator. √


d2 (a) Let $y = \cos x$. Find the best linear approximation to this
function near $x = \pi/2$. √
(b) Use the result of part a to approximate the value of cos(1.5)
without a calculator. √


d3 (a) Use the graph to visually estimate the location of the 
inflection point of the function 
y =e? — 2”. 


(b) Use calculus to find the point exactly. Vv 








Problem d3. 
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A Mercator projection, prob- 
lem e6. Note the extremely 
exaggerated scale at the poles. 


d4 The function 
y= Br _ 9-a 


has one inflection point. Locate it. v 


In problems e1-e4, differentiate the given functions. 


e1 $\sin \cos \tan x$ √
e2 $\ln \cos e^x$ √
e3 $\exp \sin \ln x$ √
e4 $\tan^{-1} \sqrt{\ln x}$ √
e5 Differentiate the function x”. Vv 
e6 On a map drawn using a Mercator projection, the y coordinate
on the paper is given by $y = a \tanh^{-1} \sin\phi$, where $\phi$ is the
latitude, a is a constant, and the inverse hyperbolic tangent function
is defined on p. 137. (a) Find the derivative $dy/d\phi$, which indicates
the latitude-dependent scale of the map in the north-south direction.
(b) The approximations $\tanh x \approx x$ and $\sin x \approx x$ are valid
for small x. Use these approximations to approximate the behavior
of $y(\phi)$ for small $\phi$, and use this to check your answer to part a.


Vv 
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fl A cold bottle of beer is left outside under a shady tree at a 
picnic. Its temperature as a function of time is given by 


$$T = a - be^{-ct},$$

where a, b, and c are constants.
(a) Infer the units of a, b, and c. (For examples of how to do this, see
section 1.9 on p. 34, example 9 on p. 29, and example 1 on p. 127.)
(b) Find the derivative dT/dt, which measures how fast the beer is
warming up. Check that its units make sense.
(c) Interpret both the original equation and your answer to part b
in the limit where $t \to \infty$.
(d) Interpret the constants a, b, and c physically. 

> Solution, p. 237 


f2 A person is parachute jumping. During the time between 
when she leaps out of the plane and when she opens her chute, her 
altitude is given by an equation of the form 


$$y = b - c\left(t + ke^{-t/k}\right),$$

where b, c, and k are constants. Because of air resistance, her velocity
does not increase at a steady rate as it would for an object
falling in vacuum.

(a) What units would b, c, and k have to have for the equation to
make sense? (For examples of how to do this, see section 1.9 on
p. 34, example 9 on p. 29, example 1 on p. 127, and problem f1
above.)

(b) Find the person’s velocity, v, as a function of time. Vv 
(c) Use your answer from part b to get an interpretation of the con- 
stant c. 

(d) Find the person’s acceleration, a, as a function of time. v 
(e) Use your answer from part d to show that if she waits long 
enough to open her chute, her acceleration will become very small. 


f3 If an object is vibrating, and the vibration is gradually dying
out, its motion (position as a function of time) is typically of the
form

$$x(t) = A\cos(\omega t + \delta)\,e^{-bt},$$

where A, $\omega$, $\delta$, and b are constants.

(a) Infer the units of each of the four constants, and give a physical
interpretation. (For examples of how to infer the units, see section
1.9 on p. 34, example 9 on p. 29, example 1 on p. 127, and problem
f1 above.)

(b) Find the velocity. 

(c) Check that the units of your answer to part b make sense.  V 
f4 Sometimes doors are built with mechanisms that automatically
close them after they have been opened. The designer can set
both the strength of the spring and the amount of friction. If there 
is too much friction in relation to the strength of the spring, the 
door takes too long to close, but if there is too little, the door will 
oscillate. For an optimal design, we get motion of the form 


$$x = c\,t\,e^{-bt},$$


where x is the position of some point on the door, and c and b are 
positive constants. (Similar systems are used for other mechanical 
devices, such as stereo speakers and the recoil mechanisms of guns.) 
In this example, the door moves in the positive direction up until a 
certain time, then stops and settles back in the negative direction, 
eventually approaching x = 0. This would be the type of motion 
we would get if someone flung a door open and the door closer then 
brought it back closed again. (a) Infer the units of the constants 
b and c. (For examples of how to do this, see example 9 on p. 29, 
example 1 on p. 127, and problem f1 above.) 

(b) Find the door’s maximum speed (i.e., the greatest absolute value 
of its velocity) as it comes back to the closed position. √
(c) Show that your answer has units that make sense. 


g1 Credit card fraud creates costs (including both economic
costs and inconvenience) for businesses, credit card holders, and 
the credit card companies. If the company institutes a particular 
measure to prevent fraud, it may be able to eliminate some fraction 
of the fraud that would otherwise have occurred. Putting some 
additional measure in place may then eliminate some fraction of the 
remaining fraud, further reducing the total amount. Let the amount 
the company spends on prevention be p. For the reasons described 
above, it’s reasonable to imagine that fraud falls off exponentially 
as a function of p, so that the total cost to the company is 


$$C(p) = p + a e^{-bp}.$$


Here a and b are constants, the first term represents the cost of
carrying out the fraud prevention, and the second term represents 
the cost of the fraud that was not prevented. 


(a) Find the value of p that minimizes the cost. √
(b) Check that the units of your answer make sense (section 1.9, 
p. 34). 


(c) For what values of the parameters a and b does your answer not 
produce a meaningful result? Check that this makes sense. 

(d) Suppose that legislation forces the credit card company to suffer 
more of the consequences of the fraud, rather than making their 
customers bear the brunt. What change does this imply in the 
parameters of the model? Check that your answer to part a shows 
the right trend when this change is applied. 
g2 Benjamin Gompertz (1779-1865) was a British mathematician
and pioneering actuarial scientist, who overcame significant social
barriers due to antisemitism. We would all like to live forever,
and actuaries are in the business of telling us that we probably can’t.
Based on mortality data, Gompertz constructed a model in which
an initial population $N_0$ of babies born at $t = 0$ becomes at a later
time $t$ a surviving population


$$N = N_0 e^{1 - e^t},$$

where I’ve simplified the expression by leaving out some constants.


If you’ve survived to age t, then your probability of dying in the
coming year is


$$-\frac{\Delta N}{N},$$

where $-\Delta N$ is the number of deaths per year. Therefore the death
rate is

$$-\frac{1}{N}\frac{dN}{dt}.$$


Show that in the Gompertz model, this death rate is proportional to
$e^t$. This exponential rate of increase is demonstrated in the figure.


g3 In problem g1 on p. 142, we minimized a function that looked
like


$$y = x + a e^{-bx},$$


where x, a, and b were all positive. Suppose instead that the function
had been

$$y = x^2 + a e^{-bx},$$

with the corresponding quantities still being positive. Using the
same technique to find its minimum, we obtain an equation of a type
called a transcendental equation, which cannot be solved exactly
for x in terms of elementary functions. Use the intermediate value
theorem to prove that such a minimum nevertheless exists, as long
as a and b are both greater than zero.


k1 Proof by induction was introduced in section 2.6.1, p. 58. 
Use induction to prove that

$$\frac{d^n}{dx^n}\, b^x = (\ln b)^n\, b^x.$$

To understand what’s going on, you may wish to calculate the first
few derivatives; however, doing this and observing the pattern does
not constitute a proof.


Problem g2. Probability of death in the U.S. in the year 2003. Note
the logarithmic scale on the vertical axis. Between the ages of about
30 and 95, the death rate rises exponentially, as shown by the
linearity of the data on the logarithmic graph.
k2 The function


$$f(x) = e^{-x^2/2}$$

defines the standard “bell curve” of statistics. (Note that
exponentiation is not associative, and that in exponentiation, $x^{y^z}$
means $x^{(y^z)}$, not $(x^y)^z$; an expression of the latter form is not
very interesting, since it simply equals $x^{yz}$.)


Proof by induction was introduced in section 2.6.1, p. 58. Use
induction to prove that the nth derivative of f is of the form


$$f^{(n)}(x) = P_n(x)\, e^{-x^2/2},$$


where $P_n$ is an nth order polynomial. To understand what’s going
on, you may wish to calculate the first few derivatives; however,
doing this and observing the pattern does not constitute a proof.
Chapter 6


Indeterminate forms and 
L’Hôpital’s rule


6.1 Indeterminate forms 
6.1.1 Why 1/0 and 0/0 are not morally equivalent 


If you enter 1/0 and 0/0 into your calculator, it probably flashes 
the same error message in both cases. You learned in grade school 
that division by zero is “undefined.” But there are completely dif- 
ferent reasons why these two types of division by zero are undefined. 
Briefly: 


• 1/0 is undefined as a real number because it would have to be
infinite, and the real number system doesn’t include infinite
numbers.¹


• 0/0 is undefined because writing this expression doesn’t give
enough information to say what it equals.


Suppose that for some real number x, we had


$$\frac{0}{0} = x.$$


Multiplying by 0 on both sides gives a condition


$$0 = 0x$$


that x should satisfy. But every real number has this property, 
so writing 0/0 doesn’t give enough information to say whether x 
is defined and, if so, what its value is. Expressions of this “not- 
enough-information” type are called indeterminate forms. 


6.1.2 Indeterminate forms from brute force on a limit 


When we try to evaluate a limit, usually our first attempt is 
simply to plug in and see if a number comes out. For example, if 
we want to evaluate 

$$\lim_{x\to 0} \frac{1 + x}{3 + x},$$

we will naturally try plugging in x = 0, get the result 1/3, and we’re 
done. This is not an indeterminate form. But, for example, suppose 





¹See section 2.9, p. 64, and example 11, p. 113.


Box 6.1 More indeterminate forms


We will mainly be concerned with the indeterminate form 0/0, but
there are other ones as well. Suppose we try to evaluate the limit


$$\lim_{\theta\to\pi/2} \left(\frac{\pi}{2} - \theta\right)\tan\theta$$

by plugging in $\theta = \pi/2$. This fails because the first factor
goes to zero, but the tangent factor blows up to infinity. This
is an example of the indeterminate form $0\cdot\infty$. The limit is
defined and equals 1, but plugging in won’t tell us that.


The limit 





ines fee = ae 
A= AS.0) 
is an example of the indeterminate form $\infty - \infty$. It equals
zero.
that $f(x) = x^2$ and we want to evaluate $f'(1)$. The definition of the
derivative in terms of a limit gives


$$f'(1) = \lim_{h\to 0} \frac{(1 + h)^2 - 1}{h},$$


and attempting to plug in h = 0 results in the indeterminate form 
0/0. This limit is well defined; it equals 2. But the indeterminate 
form tells us that the brute-force technique was too crude, and we 
needed to handle the calculation a little more delicately. 
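
As a rough numerical illustration of this point (a sketch only; the function name `difference_quotient` is made up for this example), the following Python snippet shows that plugging in h = 0 fails outright, while small nonzero values of h give results that close in on 2.

```python
def difference_quotient(h):
    """((1 + h)^2 - 1) / h, whose limit as h -> 0 is f'(1) for f(x) = x^2."""
    return ((1 + h) ** 2 - 1) / h

# Brute force: plugging in h = 0 asks the computer to evaluate 0/0.
try:
    difference_quotient(0.0)
except ZeroDivisionError as err:
    print("h = 0 fails:", err)

# Handling it more delicately: let h shrink toward zero instead.
for h in (0.1, 0.01, 0.001):
    print(h, difference_quotient(h))
```

The error at h = 0 is the computer's version of the calculator's error message; it says nothing about whether the limit exists.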

The indeterminate form 0/0 can also be undefined. For example, 


limg\.0 ve = 00. 


6.2 L’Hôpital’s rule in its simplest form


Every derivative, if defined, can be seen as a case of the
indeterminate form 0/0. Conversely, we can often convert a 0/0-type limit
into a problem in evaluating derivatives. Suppose that we want to
calculate a limit of the form

$$\lim_{x\to a} \frac{u}{v},$$

where u(a) = 0 and v(a) = 0. Then Δu = u(x) − u(a) means the
same thing as u, and similarly, Δv equals v. So we can rewrite our
limit as

$$\lim_{x\to a} \frac{\Delta u}{\Delta v}$$

or

$$\lim_{x\to a} \frac{\Delta u/\Delta x}{\Delta v/\Delta x}.$$

If $v'(a) \ne 0$, then by property Pg of the limit, p. 95, our limit
becomes


$$\frac{\lim_{x\to a} \Delta u/\Delta x}{\lim_{x\to a} \Delta v/\Delta x},$$


which equals

$$\frac{u'(a)}{v'(a)}.$$

We have proved the following.




a / Guillaume de l’Hôpital (1661-1704) was a French marquis. Born into
a military family, he eventually became a mathematician because of bad
eyesight. He wrote the first calculus textbook. As acknowledged in the
preface, the results given in the book originated with Leibniz and the
Bernoulli brothers, but l’Hôpital’s own name has become attached to the
theorem known as l’Hôpital’s rule. When students meet the Marquis, they
always wonder about his name, which looks like the English word
“hospital.” Actually, he spelled it with an “s,” and it is the same word in French.
The “H” is silent, and the accent is on the “a.” As French people gradually
stopped pronouncing the “s,” they stopped writing it, but put the housetop
accent on the “ô” to show what they were leaving out. The family name
probably comes from an early association with a “hospital,” a word that in
medieval times had a broader meaning, encompassing institutions such
as guest-houses for pilgrims and what we would today call subsidized
public housing.





Theorem: L’Hôpital’s rule (simplest form)

If u and v are functions with u(a) = 0 and v(a) = 0, the
derivatives u′(a) and v′(a) are defined, and the derivative
v′(a) ≠ 0, then

$$\lim_{x\to a} \frac{u}{v} = \frac{u'(a)}{v'(a)}.$$


We will generalize l’Hôpital’s rule in section 6.3, p. 148.


Example 1 
> Evaluate


$$\lim_{x\to 0} \frac{\sin x}{x + x^3}.$$


> Attempting to plug in x = 0 gives the indeterminate form 0/0,
and this suggests applying l’Hôpital’s rule. The derivative of the
top is cos x, and the derivative of the bottom is $1 + 3x^2$. Evaluating
these at x = 0 gives 1 and 1, so the answer is 1/1 = 1.
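
A quick numerical check of Example 1 can be sketched in Python. The function name `ratio` is an illustrative choice, not anything defined in the text. The quotient is evaluated at points closing in on 0 and compared with the ratio of derivatives that the rule predicts.

```python
import math

def ratio(x):
    """sin(x) / (x + x^3), the quotient from Example 1 (undefined at x = 0)."""
    return math.sin(x) / (x + x ** 3)

# Ratio of the derivatives of top and bottom at x = 0: cos(0) / (1 + 3*0^2).
lhopital_value = math.cos(0.0) / (1 + 3 * 0.0 ** 2)

for x in (0.1, 0.01, 0.001):
    print(f"x = {x:<6} ratio = {ratio(x):.6f}")
print("ratio of derivatives at 0:", lhopital_value)
```

The printed ratios should creep toward the predicted value as x shrinks, which is evidence for the result but of course not a proof.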


Example 2 
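The limit

$$\lim_{x\to 1} \frac{3x^2 - x - 2}{x^2 - 1}$$

is of the form 0/0, so we can try to apply l’Hôpital’s rule. We get


$$\lim_{x\to 1} \frac{3x^2 - x - 2}{x^2 - 1} = \lim_{x\to 1} \frac{6x - 1}{2x} = \frac{5}{2}.$$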
6.3 Fancier versions of l’Hôpital’s rule


Mathematical theorems are sometimes like cars. I own a Honda Fit 
that is about as bare-bones as you can get these days, but persuading 
a dealer to sell me that car was like pulling teeth. The salesman 
was absolutely certain that any sane customer would want to pay 
an extra $1,800 for such crucial amenities as upgraded floor mats 
and a chrome tailpipe. L’Hôpital’s rule in its most general form is
a much fancier piece of machinery than the stripped-down model
described in section 6.2. The price you pay for the deluxe model is
that the proof becomes much more complicated. I’ll state the fancier
versions of l’Hôpital’s rule below and give examples, but relegate
the proofs to a later section and, in one case, a homework problem.


6.3.1 Multiple applications of the rule 


In the following example, we have to use l’Hôpital’s rule twice
before we get an answer. 


Example 3 


> Evaluate

$$\lim_{x\to\pi} \frac{1 + \cos x}{(x - \pi)^2}.$$


> Applying l’Hôpital’s rule gives

$$\lim_{x\to\pi} \frac{-\sin x}{2(x - \pi)},$$

which still produces 0/0 when we plug in x = π. Going again, we
get

$$\lim_{x\to\pi} \frac{-\cos x}{2} = \frac{1}{2}.$$
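
The result of Example 3 can also be checked numerically. This is only a sketch, and the names `quotient` and `twice_differentiated` are made up for the illustration: the first function is the original expression, the second is what remains after differentiating top and bottom twice.

```python
import math

def quotient(x):
    """(1 + cos x) / (x - pi)^2 from Example 3, undefined at x = pi."""
    return (1 + math.cos(x)) / (x - math.pi) ** 2

def twice_differentiated(x):
    """Top and bottom each differentiated twice: (-cos x) / 2."""
    return -math.cos(x) / 2

for offset in (0.1, 0.01, 0.001):
    print(offset, quotient(math.pi + offset))
print("after two applications of the rule:", twice_differentiated(math.pi))
```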


This works because of the following generalization of l’Hôpital’s
rule.


Theorem: L’Hôpital’s rule (first generalization)
If u and v are functions with u(a) = 0 and v(a) = 0, and the
derivatives u′(a) and v′(a) are defined, then

$$\lim_{x\to a} \frac{u}{v} = \lim_{x\to a} \frac{u'}{v'}.$$


The difference from the original form of the theorem is that we no
longer require v′(a) ≠ 0, and the right-hand side has a limit. In cases
where v′(a) ≠ 0, the original form would have been good enough,
but the general form also works, since the limit on the right-hand
side can be evaluated simply by plugging in. We will prove this
more general form of the rule in section 6.3.4, p. 151.




6.3.2 The indeterminate form ∞/∞


Consider an example like this:


$$\lim_{x\to 0} \frac{1 + 1/x}{1 + 2/x}.$$


This is an indeterminate form like ∞/∞ rather than the 0/0 form for
which we’ve already proved l’Hôpital’s rule. L’Hôpital’s rule applies
to examples like this as well. This can be proved by rewriting an
expression like lim u/v, where both u and v blow up, in terms of
new variables U = 1/u and V = 1/v. The result is to reduce the
∞/∞ form to the 0/0 form. The proof is carried through in section
6.3.4, p. 151.


Example 4 


> Evaluate

$$\lim_{x\to 0} \frac{1 + 1/x}{1 + 2/x}.$$


> Both the numerator and the denominator go to infinity.
Differentiation of the top and bottom gives $(-x^{-2})/(-2x^{-2}) = 1/2$. We
can see that the reason the rule worked was that (1) the constant
terms were irrelevant because they become negligible as the 1/x
terms blow up; and (2) differentiating the blowing-up 1/x terms
makes them into the same $x^{-2}$ on top and bottom, which cancel.


Note that we could also have gotten this result without l’Hôpital’s
rule, simply by multiplying both the top and the bottom of the
original expression by x in order to rewrite it as (x + 1)/(x + 2).
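
Here is a small Python sketch of Example 4 (the names `original` and `simplified` are illustrative). It evaluates both the blowing-up form and the rewritten form (x + 1)/(x + 2) at small x, so that the agreement between them, and the approach to 1/2, can be seen directly.

```python
def original(x):
    """(1 + 1/x) / (1 + 2/x): top and bottom both blow up as x -> 0."""
    return (1 + 1 / x) / (1 + 2 / x)

def simplified(x):
    """The same expression after multiplying top and bottom by x."""
    return (x + 1) / (x + 2)

for x in (0.1, 0.001, 0.00001):
    print(x, original(x), simplified(x))
```

Only the simplified form can be evaluated at x = 0 itself, where it gives exactly 1/2.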


6.3.3 Limits at infinity 

It is straightforward to prove a variant of l’Hôpital’s rule that
allows us to do limits at infinity. We use a change of variable to
change a limit like $\lim_{x\to\infty} u(x)/v(x)$ to a new limit stated in terms
of a variable X = 1/x. The proof is left as an exercise (problem z1,
p. 154). The result is that l’Hôpital’s rule is equally valid when the
limit is at ±∞ rather than at some real number a.





Acme or Glutco? Example 5 
> You have some money, and two choices of what to invest it in. 
A share in Acme, Inc., costs $7, and returns a dividend of $1 per 
year. A share of Glutco costs $30 and gives a dividend of $2 
per year. If we want to compare the long-term value of the two 
investments, a natural way to do it is with the limit 


$$\lim_{t\to\infty} \frac{-7 + t}{-30 + 2t}.$$


The top represents the net return on Acme, the bottom Glutco. 
If this limit is greater than 1, then Acme is the better long-term 
investment. What is the value of this limit? 
> Differentiation of the top gives 1, and differentiation of the bottom
gives 2. The limit is therefore 1/2, and you’re wiser to invest
in Glutco. The interpretation is that the constant terms are irrele- 
vant, and in the long run the competition between the numerator 
and denominator is determined by which one grows faster. 
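
To see the long-run behavior in Example 5 concretely, here is a minimal Python sketch. The function name `net_return_ratio` and the sample time spans are my own choices for illustration; the prices and dividends are the ones given in the example.

```python
def net_return_ratio(t):
    """Acme's net return divided by Glutco's after t years, per share."""
    acme = -7 + 1 * t      # pay $7 up front, then collect $1 per year
    glutco = -30 + 2 * t   # pay $30 up front, then collect $2 per year
    return acme / glutco

for years in (100, 10_000, 1_000_000):
    print(years, net_return_ratio(years))
```

The ratio drifts down toward 1/2 as the time span grows, because the purchase prices matter less and less compared with the accumulated dividends.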
6.3.4 Proofs


The simplest form of l’Hôpital’s rule was proved in section 6.2,
p. 146. In this section we prove the generalizations of l’Hôpital’s
rule claimed in sections 6.3.1-6.3.3.


Change of variable


As described briefly in sections 6.3.2 and 6.3.3, two of the added
features of the generalized l’Hôpital’s rule (the form ∞/∞ and limits
at infinity) can be proved by a change of variable. To demonstrate
how this works, let’s imagine that we were starting from an even
more stripped-down version of l’Hôpital’s rule than the one in
section 6.2, p. 146. Say we only knew how to do limits of the form
$x \to 0$ rather than $x \to a$ for an arbitrary real number a. We
could then evaluate $\lim_{x\to a} u/v$ simply by defining t = x − a and
reexpressing u and v in terms of t.


> Example 6
Reduce

$$\lim_{x\to\pi} \frac{\sin x}{x - \pi}$$

to a form involving a limit at 0.


> Define t = x − π. Solving for x gives x = t + π. We substitute
into the above expression to find

$$\lim_{x\to\pi} \frac{\sin x}{x - \pi} = \lim_{t\to 0} \frac{\sin(t + \pi)}{t}.$$

If all we knew was the → 0 form of l’Hôpital’s rule, then this would
suffice to reduce the problem to one we knew how to solve. In
fact, this kind of change of variable works in all cases, not just for
a limit at π, so rather than going through a laborious change of
variable every time, we could simply establish the more general
form in section 6.2, p. 146, with → a.
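
The change of variable in Example 6 can be tried out numerically. In the following Python sketch (the names `in_x` and `in_t` are hypothetical), the same quotient is computed once in terms of x near π and once in terms of t near 0. The two columns should agree, and both should approach cos π = −1, which is the value the simplest form of the rule gives for this limit.

```python
import math

def in_x(x):
    """sin(x) / (x - pi): the original form, with the limit taken at x = pi."""
    return math.sin(x) / (x - math.pi)

def in_t(t):
    """sin(t + pi) / t: the same quotient after substituting x = t + pi."""
    return math.sin(t + math.pi) / t

for t in (0.1, 0.01, 0.001):
    print(t, in_x(math.pi + t), in_t(t))
```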


The form ∞/∞


To see why l’Hôpital’s rule works in general for ∞/∞ forms,
let’s try a change of variable on the outputs of the functions u and
v rather than their inputs. Suppose that our original problem is of
the form

$$\lim \frac{u}{v},$$

where both functions blow up.² We then define U = 1/u and
V = 1/v. We now have

$$\lim \frac{u}{v} = \lim \frac{1/U}{1/V} = \lim \frac{V}{U},$$

and since U and V both approach zero, we have reduced the problem
to one that can be solved using the version of l’Hôpital’s rule already
proved for the indeterminate form 0/0:

$$\lim \frac{V}{U} = \lim \frac{V'}{U'}.$$

Differentiating and applying the chain rule, we have

$$\lim \frac{u}{v} = \lim \frac{-v^{-2}v'}{-u^{-2}u'}.$$

Since lim ab = lim a lim b provided that lim a and lim b are both
defined (property Ps, p. 95), we can rearrange factors to produce the
desired result, but this only works under the assumption that
the limits of the two factors on the right do both exist. Writing
L = lim u/v, the right-hand side is L² lim v′/u′, so that
lim u′/v′ = L provided that L ≠ 0.
Therefore the above proof works only when lim u/v ≠ 0. This
restriction is in fact inessential, and the rule does hold even when
lim u/v = 0. For some proofs that work in the more general case,
see https://tinyurl.com/rw6jdh4.


²Think about what happens when only u blows up, or only v.


Limits at infinity 


As briefly outlined in section 6.3.3, this proof can be done by 
using a change of variables of the form X = 1/x. The proof is left
as an exercise (problem z1, p. 154). 




Problems 
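a1 Verify the following limits.

$$\lim_{s\to 1} \frac{s^3 - 1}{s - 1} = 3$$

$$\lim_{\theta\to 0} \frac{1 - \cos\theta}{\theta^2} = \frac{1}{2}$$

$$\lim_{x\to\infty} \frac{5x^2 - 2x}{x} = \infty$$

$$\lim_{n\to\infty} \frac{n(n + 1)}{(n + 2)(n + 3)} = 1$$

$$\lim_{x\to\infty} \frac{ax^2 + bx + c}{dx^2 + ex + f} = \frac{a}{d}$$

[Granville, 1911] > Solution, p. 238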


a2 Evaluate 





exactly, and check your result by numerical approximation. 
> Solution, p. 238 


a3 Amy is asked to evaluate 


She applies l’Hôpital’s rule, differentiating top and bottom to find
$1/e^x$, which equals 1 when she plugs in x = 0. What is wrong with
her reasoning? > Solution, p. 239


a4 Evaluate

$$\lim_{u\to 0} \frac{u^2}{e^u + e^{-u} - 2}$$

exactly, and check your result by numerical approximation.
> Solution, p. 239


a5 Evaluate

$$\lim_{t\to\pi} \frac{\sin t}{t - \pi}$$

exactly, and check your result by numerical approximation.
> Solution, p. 239


d1 Compute the following limits using l’Hôpital’s rule.
2 
Heel 





i JV 
(a) Be x? — 8xr-9 
6 ites sin 22 JV 
a3n/2 COSXL- 
es COS TX / 





21/2 1—22 


d2 Suppose n is some positive integer, and the limit 
—1+27/2 
lm £282 +a°/ 1 
x0 Ze 


exists. Also suppose L ≠ 0. What is n? What is the limit L? √





d3 What happens when you use l’Hôpital’s rule to compute
these limits? Compare against what you would have gotten by a 
more straightforward method. 


d4 The logical role of counterexamples was discussed in box 1.3, 
p. 20. The following rule sounds very much like l’Hôpital’s: if

$$\lim_{x\to a} \frac{f'(x)}{g'(x)}$$

exists, then $\lim_{x\to a} f(x)/g(x)$ also exists, and the two limits
are equal.


But this is not always true! Find a counterexample.


d5 Here is a method for computing derivatives: since, by
definition,

$$f'(a) = \lim_{x\to a} \frac{f(x) - f(a)}{x - a}$$

is a limit of the form 0/0, we can always try to find it by using
l’Hôpital’s rule. What happens when you do that?


z1 Section 6.3.4, p. 151, demonstrates the use of changes of
variable in proving variants on l’Hôpital’s rule. As suggested on
p. 152, do this for limits at infinity, using the change of variable
X = 1/x.
From functions to
variables 


7.1 Some unrealistic features of our view of 
computation so far 


Calculus was invented by Newton and Leibniz, who lived in an era 
when the best tool for calculation was a freshly sharpened quill, 
used for writing down formulas. They had in mind a certain model 
of computation. I’ve introduced you to a related but somewhat 
different, modern model, based on functions. This model doesn’t 
always relate well to reality. 





We defined a function geometrically, as a graph that passes the 
vertical line test. This doesn’t work well in an example like figure 
a. It shouldn’t matter whether we take the photo from one angle 
or another, but if we insist on describing this shape as a function, 
then rotating it makes a huge difference — the difference between 
being able to describe the shape and not being able to. In a/2, y is
a function of x. In a/3, y isn’t a function of x; it fails the vertical
line test. In a/4, x is a function of y, but y isn’t a function of x.
These distinctions are silly in this context. The x and y coordinates 
are arbitrary, and we shouldn’t treat them asymmetrically. We can 
think of the teacup as a little computer that knows how to compute 
this particular graph. The teacup doesn’t know or care what’s x or 
what’s y; neither x nor y is its “input” or “output.” 


7.2 Newton’s method 


In the teacup-computer’s personal utopia, there is no distinction 
between input and output. But if we want to join the teacup in 
computational nirvana, we have a problem, because we, unlike the 





a / Light inside a teacup makes a cusp. Rotating the graph should
be irrelevant.


b / This archaic computing device is called a slide rule. Like
the teacup in figure a, it’s an analog computer, and it doesn’t
have inputs or outputs. Let A be a number on the scale marked “A,”
and B the number below it on the “B” scale. Then with the central
sliding stick in the position shown in the photo, A = 4B.



teacup, find some functions easier to compute than their inverses. 
For example, every sixth-grade kid in California is supposed to know 
how to take the cube of a decimal number such as 4.43. That is, 
given x, they can compute $y = x^3$. But how many people do you
know who can invert the function and efficiently obtain $x = \sqrt[3]{y}$
with paper and pencil? Some functions are computationally cheap
to evaluate, but computationally expensive to evaluate in reverse.¹


Newton, however, invented a method that allows us to at least 
partially overcome this uninvertibility problem. Newton’s method 
lets us find a good approximation to x for a given y, provided that 
we know how to evaluate both y and dy/dx for a given x.


Suppose that we want to find the cube root of 87. We start
with a rough mental guess: since $4^3 = 64$ is a little too small, and
$5^3 = 125$ is much too big, we guess $x \approx 4.3$. Testing our guess, we
have $4.3^3 = 79.5$. We want y to get bigger by 7.5, and we can use
calculus to find approximately how much bigger x needs to get in
order to accomplish that:


$$\frac{dy}{dx} \approx \frac{\Delta y}{\Delta x}$$

$$\Delta x \approx \frac{\Delta y}{dy/dx} = \frac{\Delta y}{3x^2} = 0.14$$


Increasing our value of x to 4.3 + 0.14 = 4.44, we find that $4.44^3 =
87.5$ is a pretty good approximation to 87. If we need higher
precision, we can go through the process again with Δy = −0.5, giving


$$\Delta x \approx \frac{\Delta y}{3x^2} = -0.01$$

$$x = 4.43$$

$$x^3 = 86.9.$$



This second iteration gives an excellent approximation. 
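
The procedure above is easy to automate. The following Python function is a minimal sketch of it (the name `newton_cube_root` and its parameters are my own, not anything defined in the text): at each step it finds how far y is from the target and converts that into a correction to x using dy/dx = 3x².

```python
def newton_cube_root(y_target, x, steps):
    """Improve the guess x for the cube root of y_target by repeated correction."""
    for step in range(1, steps + 1):
        delta_y = y_target - x ** 3        # how much y still needs to change
        delta_x = delta_y / (3 * x ** 2)   # delta_x ~ delta_y / (dy/dx)
        x += delta_x
        print(f"step {step}: x = {x:.6f}, x^3 = {x ** 3:.6f}")
    return x

root = newton_cube_root(87, 4.3, 4)
```

Because the code keeps all its digits rather than rounding to 4.44 after the first step, its intermediate values differ slightly from the hand calculation, but it settles on the same answer of about 4.43.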





¹An extreme example is embedded in the cryptography systems that allow
you to buy something online without worrying that your credit card number 
is being exposed to random people as it hops across the internet from you to 
amazon.com. These algorithms depend on the fact that it is computationally 
cheap to multiply large numbers, but prohibitively expensive to factor a large 
number into its prime factors. 
The orbit of Mercury Example 1
> Figure 1 shows the astronomer Johannes Kepler’s analysis of 
the motion of the planets. The ellipse is the orbit of the planet 
around the sun. At t = 0, the planet is at its closest approach to 
the sun, A. At some later time, the planet is at point B. The angle
x (measured in radians) is defined with reference to the imaginary
circle encompassing the orbit. Kepler found the equation


$$2\pi\frac{t}{T} = x - e\sin x,$$





where the period, T, is the time required for the planet to com- 
plete a full orbit, and the eccentricity of the ellipse, e, is a number 
that measures how much it differs from a circle. The relationship 
is complicated because the planet speeds up as it falls inward to- 
ward the sun, and slows down again as it swings back away from 
it. 


c/ Example 1. 


The planet Mercury has e = 0.206. Find the angle x when Mer- 
cury has completed 1/4 of a period. 


> We have

$$y = x - (0.206)\sin x,$$


and we want to find x when $y = 2\pi/4 = 1.57$. As a first guess, we
try $x = \pi/2$ (90 degrees), since the eccentricity of Mercury’s orbit
is actually much smaller than the example shown in the figure,
and therefore the planet’s speed doesn’t vary all that much as it
goes around the sun. For this value of x we have y = 1.36, which
is too small by 0.21.


$$\Delta x \approx \frac{\Delta y}{dy/dx} = \frac{0.21}{1 - (0.206)\cos x} = 0.21$$


(The derivative dy/dx happens to be 1 at $x = \pi/2$.) This gives
a new value of x, 1.57 + 0.21 = 1.78. Testing it, we have y = 1.58,
which is correct to within rounding errors after only one iteration. 
(We were only supplied with a value of e accurate to three sig- 
nificant figures, so we can’t get a result with precision better than 
about that level.) 
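
For readers who want to experiment, here is a rough Python sketch of the same calculation. The names `ECCENTRICITY`, `y`, `dy_dx`, and `solve_for_x` are invented for the illustration. It applies the correction Δx ≈ Δy/(dy/dx) several times, using y = x − e sin x and dy/dx = 1 − e cos x.

```python
import math

ECCENTRICITY = 0.206  # Mercury's e, as given in the example

def y(x):
    """Right-hand side of Kepler's equation: x - e sin x."""
    return x - ECCENTRICITY * math.sin(x)

def dy_dx(x):
    """Derivative of y with respect to x: 1 - e cos x."""
    return 1 - ECCENTRICITY * math.cos(x)

def solve_for_x(y_target, x, steps):
    """Repeatedly correct the guess x so that y(x) approaches y_target."""
    for _ in range(steps):
        x += (y_target - y(x)) / dy_dx(x)
    return x

quarter_period = 2 * math.pi / 4
x = solve_for_x(quarter_period, math.pi / 2, 5)
print(x, y(x))
```

Since the code doesn't round at each step, it lands on a value of x slightly below the 1.78 found by hand, consistent with the remark that the hand result is only good to within rounding errors.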


Usually the series of estimates $x_0, x_1, x_2, \ldots$ provided by
Newton’s method converges, meaning that $\lim_{n\to\infty} x_n$ exists.
Furthermore, the convergence is often very rapid, so that only a few
iterations are needed to get excellent precision. But as explored further
in problem z1, p. 171, Newton’s method sometimes fails to converge.
d / “Give me a lever and a
place to stand, and I will move
the world.” — Archimedes 





7.3 Related rates 


Figure d is old and fanciful, but it exemplifies an idea that we use 
every day. We have some machine or mechanical linkage, which 
could be as simple as the corkscrew used to open a bottle of wine, 
or as complicated as the suspension on a fancy sports car. The 
motion of one part of the machine is not independent of the other 
parts. In the simple example of a lever, suppose that the heights²
of the two ends relative to the fulcrum are A on the left and B on
the right. Then we have a constraint of the form


$$\frac{B}{A} = -k, \qquad (1)$$


where k is the ratio of the lengths of the arms, and the minus sign 
is because if one end goes up, the other has to come down. In figure 
d, k = 11; of course Archimedes was imagining k as some very large 
number, but the cartoonist had to fit everything. Notice that we 
have no natural reason to call B a function of A or A a function 
of B. If the arm of the lever is perfectly rigid, then all we can say 
is that whatever forces act on the ends, the outcome will satisfy 
the constraint. We don’t have to consider one variable as causing 
the other. (The earth looks more likely to move Archimedes than 
Archimedes is to move the earth.) In (1), I picked one variable to 
be on top and the other on the bottom, but instead of B/A = —11, 
I could just as easily have written A/B = —1/11. 


In examples like this one, we naturally want to know the speed 
of the motion. How fast will the cork come out of the wine bottle? 
How fast will my bike go up a hill if I’m in a certain gear? Based 
on your training so far, you are likely to come up with the following 
answer for the lever. The position A of the load on the left side of 
the lever is a function of the position B of the right end, while B is 
in turn a function of time t. The chain rule therefore gives 


$$\frac{dA}{dt} = \frac{dA}{dB}\,\frac{dB}{dt}. \qquad (2)$$


We know dA/dB, which, based on the constraint, is simply —1/k. 
Next we write down a formula for the function B(t), differentiate it, 
and plug the result in to equation (2). Done. A triumph of calculus. 


Oops. There is no mathematical formula for B(t). The motion 
of the right end of the lever in figure d comes from an old Greek guy 
grunting and muttering curses into his white beard. 


The term “related rates” is used in calculus to refer to the fact 
that we don’t necessarily care whether the function B(t) is known. 
Often it may be of interest simply to know that if B changes at a 
given rate, then A will change at some other rate. These two rates 
are related to each other by the constraint equation (1). 
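
The idea can be put in code without ever knowing B(t). In this minimal Python sketch (the function name `rate_of_A` and the sample speeds are hypothetical), the lever constraint B/A = −k is used only to convert any given rate of change of B into the corresponding rate of change of A, dA/dt = −(1/k) dB/dt.

```python
def rate_of_A(rate_of_B, k):
    """dA/dt implied by the lever constraint B/A = -k, i.e. A = -B/k."""
    return -rate_of_B / k

k = 11  # ratio of the arm lengths quoted in the text for figure d
for rate_of_B in (-1.0, -0.5, 2.0):  # made-up speeds, in any unit of length per second
    print(rate_of_B, rate_of_A(rate_of_B, k))
```

Whatever Archimedes does to the right end, the left end moves in the opposite direction and 11 times more slowly.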





²These heights should actually be measured along circular arcs.
Scuba diving Example 2
When scuba divers ascend or descend, they have to control how
fast they go, or else the changes in pressure will be too rapid, and
they can be killed. Let P be the pressure in units of atmospheres,
y the depth in meters, and t the time in minutes. We then have


$$\frac{dP}{dt} = \frac{dP}{dy}\,\frac{dy}{dt}.$$

Given the density of water and the strength of the earth’s
gravity, dP/dy = 0.1 atm/m. The standard advice is not to ascend
faster than dy/dt ~ −10 m/min. This implies that a diver’s
body can safely withstand decompression at a rate dP/dt ~
−1 atm/min.


Cams Example 3 
Cams, like the ones shown in figure e, can be thought of as the 
mechanical realization of the mathematical notion of a function. 
As the cam rotates, the follower rides up and down above it. 


The crankshaft of an engine has its angle φ determined by
mechanical linkages (the piston rods) to the pistons. In a four-stroke
engine such as the ones in cars, the crankshaft is geared to
the camshaft so that the camshaft’s angle θ is constrained by
θ = φ/2. The camshaft then drives each follower, whose height
h is controlled by a function h(θ). This function is determined by
the shape of the cam. The followers open and close the valves,
which perform functions such as letting fuel into the cylinders.
The velocity of the follower is given by


$$\frac{dh}{dt} = \frac{dh}{d\theta}\,\frac{d\theta}{d\phi}\,\frac{d\phi}{dt},$$


where $d\phi/dt$ is what we measure on a tachometer.





Cam 1 in the figure is shaped so that the follower falls at constant 
velocity and rises at constant velocity. This has the disadvantage 
that $d^2h/d\theta^2$ is infinite, which would theoretically cause infinite
acceleration $d^2h/dt^2$ in the follower at the turn-around points. In
reality the result would be that the follower would leave contact 
with the cam, and there would be undesirable vibration. 


Cam 2 is shaped according to

$$h(\theta) = 1 - \frac{1}{\pi}\left[|\theta| - \frac{1}{2}\sin(2|\theta|)\right]$$


for $\theta \in [-\pi, \pi]$. This is known as a cycloid cam. It has the
desirable property that all of its derivatives up to the third, $d^3h/d\theta^3$, are
finite, and furthermore that the cycloidal segments of the graph
can be joined smoothly onto constant (“dwell”) segments without
losing these properties. For the reasons discussed in example 5,
p. 89, it is desirable not to have a large third derivative.
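
The smooth joining onto dwell segments can be checked numerically. The Python sketch below assumes that the cycloid profile is h(θ) = 1 − (|θ| − ½ sin 2|θ|)/π on [−π, π], which is my reading of the formula for cam 2; the function names and the step size are likewise my own choices. It estimates dh/dθ by a central difference at the top of the stroke, at mid-stroke, and just before the end of the stroke.

```python
import math

def h(theta):
    """Assumed cycloid profile: 1 - (|theta| - sin(2|theta|)/2)/pi on [-pi, pi]."""
    t = abs(theta)
    return 1 - (t - 0.5 * math.sin(2 * t)) / math.pi

def slope(theta, step=1e-5):
    """Central-difference estimate of dh/dtheta."""
    return (h(theta + step) - h(theta - step)) / (2 * step)

for theta in (0.0, math.pi / 2, math.pi - 1e-3):
    print(f"theta = {theta:.4f}  h = {h(theta):.6f}  dh/dtheta = {slope(theta):.6f}")
```

Under this assumption the slope is essentially zero at both ends of the stroke, which is what allows a flat dwell segment to be attached there without a kink.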

















e / Example 3. Top: a racing
camshaft from a car. Middle: two
cams with specific mathematical
shapes. Bottom: Graphs of
h(θ) and its first and second
derivatives.
f / The equation $x^2 + y^2 = r^2$
does not define a function unless
we restrict it to an appropriate
region.


g / Example 4. Watt’s linkage, and the lemniscate of Bernoulli,
$(x^2 + y^2)^2 = 2(x^2 - y^2)$.





Implicit functions 


As you read this, the world is turning, and you are moving in a circle. 
Let this circle be centered on the origin, with radius r. Physical 
forces constrain you to stay on this circle, rather than flying up into 
the sky or sinking down into the earth’s core. The Pythagorean 
theorem allows us to write this constraint as the equation 


$$x^2 + y^2 = r^2, \tag{3}$$


whose solutions are graphed in figure f. This graph fails the vertical 
line test, so y isn’t a function of x, and it also fails the horizontal 
line test, so x isn’t a function of y. Usually by restricting it to a 
small enough region, we can make it into a function. If we restrict 
to region 1, 2, 3, or 5, y is a function of x, and similarly for x as 
a function of y in regions 1, 2, 3, and 4. The largest piece of the 
graph on which equation (3) defines a function is a semicircle. For 
example, we could solve for x and find the function


$$x(y) = -\sqrt{r^2 - y^2}, \tag{4}$$


where the choice of the negative square root gives the left-hand half 
of the circle. Equation (3) is said to define an implicit function, 
while (4) defines an explicit one. In an example such as this one, it 
would be inconvenient to try to work with explicit functions. For 
example, if we insisted on having explicit functions, we would run 
into hassles because any calculation would have to be broken down 
into special cases covering different regions. 


Watt's linkage Example 4 
Figure g shows a mechanical linkage patented by James Watt in 
1784, and still used in applications such as automobile suspen- 
sions. It consists of a chain of three linked rods that are free to 
rotate about bearings at their ends. The ends of the chain are 
fixed. The purpose of the arrangement is to constrain some ob- 
ject, attached to the center of the middle rod, to move along the 
figure-eight curve shown as a dotted line. In this example, the 
proportions of the three arms are 1 : √2 : 1, so that when the
central point is at the center of the curve, they outline a square. 
This choice of proportions, along with an appropriate choice of 
scale for the coordinates, can be shown to produce a curve with 
the equation 

$$(x^2 + y^2)^2 = 2(x^2 - y^2). \tag{5}$$


In a typical application of a Watt linkage, the central point is at- 
tached to the chassis of a car, and the ends are attached to the 
wheels. The linkage is reoriented so that the darkened segment 
of the curve is approximately vertical, and the car’s chassis is 
then constrained so that its motion is nearly vertical. When the 
car goes around corners, the body can’t move sideways. 
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Equation (5) constrains x and y relative to one another, and makes 
either variable an implicit function of the other. The linkage can 
be thought of as a type of computer (an analog computer rather 
than a digital one) that computes the implicit function (5). 


7.5 Implicit differentiation 


We would like to be able to do calculus on implicit functions. As a 
typical application, consider example 4. If vertical motion is desired 
for small displacements from the center, then we want to rotate the 
linkage by the correct angle so that the dark portion of the figure- 
eight curve is vertical near its center. That is, we want to know the 
slope of the tangent line at this point, so that we can rotate the 
tangent line and make it vertical. The slope of the tangent line is 
the derivative, so essentially we need to differentiate a graph that 
represents an implicit rather than explicit function. 


7.5.1 Some simple examples 
An example involving addition 
But let’s start with a simpler example. In figure h, we want 
to find a proportion between the motion of the tractor and stump. 
With some arithmetic, we find 
$$A + 2B - 2\ell_2 - \ell_1 = 0, \tag{6}$$
which is an implicit relation between A and B. Any change ΔA in
the position of the tractor will correspond to some change ΔB in
the position of the stump. Setting the change in the left-hand side
of equation (6) equal to 0, we have
$$\Delta(A + 2B - 2\ell_2 - \ell_1) = 0.$$
The change in a sum is the same as the sum of the changes, so
ΔA + 2ΔB − 2Δℓ₂ − Δℓ₁ = 0. But the constants don’t change, so
$$\Delta A + 2\Delta B = 0. \tag{7a}$$
The tractor moves twice as much as the stump, and the motion is


such that as A increases, B decreases. All of the following are just 
different ways of expressing the same thought. 





$$\frac{dA}{dB} + 2 = 0 \tag{7b}$$

$$1 + 2\frac{dB}{dA} = 0 \tag{7c}$$

$$\frac{dA}{dt} + 2\frac{dB}{dt} = 0 \tag{7d}$$

$$dA + 2\,dB = 0 \tag{7e}$$


Equation (7e) says that if (7a) works for ordinary numbers like 2 
meters and —1 meter, then it should also work for infinitely small 
numbers (section 2.9, p. 64). Alternatively, some people like to think 
of an equation like (7e) as nothing more than an informal shorthand 
for equations involving derivatives such as 7b-7d. 
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h/1. Farmer Bill pulls a stump. 
The pulley is a simple machine, 
like the lever of section 7.3. Just 
like the lever, it increases the 
applied force by some factor, 
while decreasing the motion 
by the same factor. 2. In our 
mathematical model, the fixed 
post is assumed to be immovable 
and perfectly rigid, and the ropes 
perfectly unstretchable, so that 
their lengths ℓ₁ and ℓ₂ are constant.
For simplicity, we neglect
the radius of the pulley. 
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area=v dp 


area=p dv





i/A geometrical interpretation
of equation (9a). Boyle’s law
says that the areas of the initial, 
dark rectangle and the final, 
dashed rectangle are the same. 
The area vdp lost in the top strip 
equals the area pdv gained in 
the side strip. 


An example with multiplication 


Boyle’s law states that at a fixed temperature, a sample of an 
ideal gas has its pressure and volume related by 


$$pv = k, \tag{8}$$


where k is a constant. For example, compressing the gas to a smaller
volume makes its pressure increase. 


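Suppose that the pressure changes from p to p + Δp, and the
volume from v to v + Δv. Then:


$$\begin{aligned}
\Delta(pv) &= 0 && \text{[change in each side of (8); } \Delta k = 0\text{]} \\
(p + \Delta p)(v + \Delta v) - pv &= 0 && \text{[subtract initial } pv \text{ from final]} \\
p\,\Delta v + v\,\Delta p + \Delta p\,\Delta v &= 0 && \text{[distribute and cancel } pv \text{ terms]}
\end{aligned}$$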


This messy expression can be cleaned up in the case where Δp and
Δv are small. The product of two small numbers is even smaller,
and if we make them small enough, their product will always be 
negligibly small compared to them. (Cf. p. 47.) To show that we’re 
now talking about very small numbers, we notate the changes as dp 
and dv. We then have: 


$$p\,dv + v\,dp = 0. \tag{9a}$$


This looks just like the product rule. In this context, symbols like 
dp and dv are referred to as differentials, and we talk about “taking 
differentials” on both sides of (8) to get (9a). The process of taking 
differentials is no different than the process of taking a derivative. As 
in the example of the pulley on p. 161, there are multiple equivalent 
ways of expressing this statement: 


$$p\frac{dv}{dp} + v = 0 \tag{9b}$$

$$p + v\frac{dp}{dv} = 0 \tag{9c}$$

$$p\frac{dv}{dt} + v\frac{dp}{dt} = 0 \tag{9d}$$

Some people think of 9a as just a shorthand for (9b)-(9d). 
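
To see how quickly the neglected term ΔpΔv stops mattering, here is a small Python sketch. The numerical values of p and v are arbitrary illustrative assumptions, and the function names are mine. It compares the exact pressure change required by pv = k with the change predicted by equation (9a).

```python
def exact_dp(p, v, dv):
    # Exact pressure change that keeps p*v constant when v changes by dv.
    k = p * v
    return k / (v + dv) - p

def differential_dp(p, v, dv):
    # Pressure change predicted by p dv + v dp = 0, equation (9a).
    return -p * dv / v

p, v = 100.0, 2.0  # illustrative values in arbitrary units (assumed)
for dv in (0.1, 0.01, 0.001):
    print(dv, exact_dp(p, v, dv), differential_dp(p, v, dv))
```

Each time dv shrinks by a factor of 10, the disagreement between the two columns shrinks by roughly a factor of 100, as we expect from a term proportional to the product of two small numbers.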


7.5.2 Implicit differentiation in general 
Reduced to differentiation of functions 


The examples in section 7.5.1 show that no new techniques are 
needed for implicit differentiation. Every fact about differentiating 
a function corresponds to a similar fact about implicit differentia- 
tion. If we wish, we can do implicit differentiation according to the 
following recipe, which reduces it to differentiation of a function: 


1. Take the equation that defines the implicit function and dif- 
ferentiate both sides with respect to something. It doesn’t 
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matter what we differentiate with respect to; it can be one 
of the two variables in the equation, or it can be some other 
variable such as time. 


2. (Optional.) If desired, clear all the factors of 1/d(something).


A circle Example 5 
> The equation x² + y² = r² defines a circle. Implicitly differentiate
it. 


> It doesn’t matter what we differentiate with respect to, so let’s 
differentiate with respect to t, which lets us imagine that the point 
(x, y) is moving around the circle as time passes. Since r is a
constant, the derivative of the right-hand side is zero.


$$\frac{d(x^2)}{dt} + \frac{d(y^2)}{dt} = 0$$








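Since the expressions x² and y² aren’t written in terms of t, we
need to use the chain rule.


$$\begin{aligned}
\frac{d(x^2)}{dx}\frac{dx}{dt} + \frac{d(y^2)}{dy}\frac{dy}{dt} &= 0 \\
2x\frac{dx}{dt} + 2y\frac{dy}{dt} &= 0 \\
x\frac{dx}{dt} + y\frac{dy}{dt} &= 0
\end{aligned}$$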





We could stop here if we wished, but the factors of 1/dt are
messy, and t wasn’t even a variable in the original statement of
the problem, so it’s nicer to multiply by dt on both sides. We have


$$x\,dx + y\,dy = 0 \tag{10}$$
or, equivalently,
$$\frac{dy}{dx} = -\frac{x}{y}. \tag{11}$$


The form (10) has the advantage that it holds anywhere on the 
circle, whose graph isn’t a function. Some people would prefer 
(11) because they don’t believe in Santa Claus or infinitesimals, 
but it has the disadvantage that it breaks the symmetry between 
x and y, and it doesn’t hold at the two points on the circle where
y=0. 


An approximation on the circle Example 6 
> The following are two nearby points on the unit circle: 
(0.400000, 0.916515), (0.401000, 0.916078) 


Verify that equation (10) is a good approximation. 
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j/Examples 5 and 6. The 
reason for the unexpectedly 
simple result dy/dx = —x/y 
becomes apparent here because 
the slope of the radius is y/x, 
and the tangent line must be 
perpendicular to the radius. 
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> Since Δx and Δy are small, it makes sense to expect that (10)
will be approximately correct if we substitute deltas for the differ- 
entials. Let’s see if that’s true. 


$$\begin{aligned}
x\,\Delta x + y\,\Delta y &= (0.400000)(0.001000) + (0.916515)(-0.000437) \\
&= -0.000001
\end{aligned}$$


The approximation is so good that when we round off to six deci- 
mal places, the result almost rounds to zero. 
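
The same check can be done in a few lines of Python. This is only a sketch of the arithmetic in this example, using the two points given in the problem statement; the variable names are mine. It also compares the slope of the line through the two points with the tangent slope −x/y.

```python
# The two nearby points on the unit circle given in the example.
x1, y1 = 0.400000, 0.916515
x2, y2 = 0.401000, 0.916078

dx = x2 - x1
dy = y2 - y1

residual = x1 * dx + y1 * dy   # should be close to zero if x dx + y dy = 0 holds
slope_secant = dy / dx         # slope of the line through the two points
slope_tangent = -x1 / y1       # dy/dx = -x/y evaluated at the first point

print(residual, slope_secant, slope_tangent)
```

The residual is far smaller than either dx or dy, and the two slopes agree to about three decimal places.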


A little bit of ... 


Although we saw above that implicit differentiation can be re- 
duced to differentiation of functions, this is not necessary in general. 
People who are proficient in calculus don’t go around making up ad- 
ditional variables like the t in example 5. For example, say that a 
square has sides of length u. We can think of d as meaning “a little 
bit of ...,”° so that du is a little bit of a change in the length of 
the square’s sides. Now u² is the area of the square, and d(u²) is a
little bit of a change in its area. We have a power law that says


$$d(u^k) = k u^{k-1}\,du.$$
This power law is exactly analogous to the one for a function u(t),
which, if we apply the chain rule, is


$$\frac{d(u^k)}{dt} = k u^{k-1}\frac{du}{dt}.$$
Obviously neither of these needs to be memorized separately from
the other. Expressions like du and d(u²) are known as differentials.


Differential of a polynomial Example 7 
> Find the differential of s² + s, and use it to approximate the
change in this expression as s changes from 1.000 to 1.001. 


> For differentiation we have a rule that the derivative of the sum 
of two functions is the sum of the derivatives. The analogous rule 
for differentials is that the differential of a sum is the sum of the 
differentials. Therefore 


$$d(s^2 + s) = d(s^2) + ds.$$


Likewise we have a power rule for differentials that corresponds to 
the power rule for derivatives, and the case of the second power 
was discussed in detail above. We therefore find 


$$d(s^2 + s) = 2s\,ds + ds.$$
The numerical approximation is


$$\Delta(s^2 + s) \approx (2s + 1)\Delta s = (3)(0.001) = 0.003.$$
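
For comparison, the exact change is easy to compute by brute force. The following Python sketch (the function names are mine) evaluates both the exact change and the approximation from the differential.

```python
def g(s):
    # The expression whose change we want.
    return s**2 + s

def dg(s, ds):
    # Its differential, d(s^2 + s) = (2s + 1) ds.
    return (2 * s + 1) * ds

exact = g(1.001) - g(1.000)
approx = dg(1.000, 0.001)
print(exact, approx)
```

The exact change is 0.003001, so the differential is off by (Δs)² = 0.000001, which is the kind of second-order error we agreed to neglect.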





°The phrase is due to the direct and unpretentious Silvanus Thompson, au- 
thor of a best-selling 1910 calculus textbook. 
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The power law for fractional exponents Example 8 
In section 2.6.3, p. 58, we gave a proof, using only the elementary
rules of calculus, that the derivative of x^(1/2) was (1/2)x^(−1/2), as
expected from the power rule. We remarked that although it was 
clear that such an argument could be constructed for any frac- 
tional exponent, that was not the same as giving a general proof. 
We can write such a proof using implicit differentiation. (We have 
already proved this fact for any real exponent, using the exponen- 
tial function, in example 4 on p. 135.) 


Let n = p/q where p and q are integers and let
$$y = x^{p/q}.$$


By raising both sides to the power q, we can make this into an
implicit function that uses only integer exponents.


$$y^q = x^p.$$
Implicit differentiation gives
$$q y^{q-1}\,dy = p x^{p-1}\,dx.$$


We then have


$$\begin{aligned}
\frac{dy}{dx} &= \frac{p x^{p-1}}{q y^{q-1}} \\
&= \frac{p}{q}\, x^{p-1} x^{-(p/q)(q-1)} \\
&= \frac{p}{q}\, x^{p/q-1}.
\end{aligned}$$
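
A proof doesn’t need a numerical check, but one is cheap. The following Python sketch compares the power rule for a fractional exponent with a central-difference estimate of the derivative. The choice p = 2, q = 3, x = 1.7 and the step size are arbitrary assumptions made for illustration.

```python
def power_rule(x, p, q):
    # Derivative of x^(p/q) according to the power rule, (p/q) x^(p/q - 1).
    return (p / q) * x ** (p / q - 1)

def central_difference(x, p, q, h=1e-6):
    # Numerical estimate of the same derivative from two nearby points.
    f = lambda t: t ** (p / q)
    return (f(x + h) - f(x - h)) / (2 * h)

print(power_rule(1.7, 2, 3), central_difference(1.7, 2, 3))
```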


Example 9 
Let y = f(x) be a function defined by 


$$2y + \sin y - x = 0.$$


(We encountered a function of this form in a real-world applica- 
tion in example 1, p. 157.) It turns out to be impossible to find a 
formula that tells you what f(x) is for any given x (i.e., there’s no 
formula for the solution y of the equation 2y + sin y = x.) But you 
can find many points on the graph by picking some y value and 
computing the corresponding x. 


For instance, if y = π then x = 2π, so that f(2π) = π: the point
(2π, π) lies on the graph of f. Let’s find how small changes in x
and y relate to one another near this point. 


Taking differentials on both sides of the defining equation, we 
have 
$$2\,dy + \cos y\,dy - dx = 0$$
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k/Example 9. The graph of 
x = 2y + sin y contains the point
(2π, π). What is the slope of the
tangent line at that point? 













l/Example 10. The graph of
x + cos x = y + e^y passes through
the origin. Its slope there is 1/2. 


or 
$$(2 + \cos y)\,dy - dx = 0.$$


We were thinking of y as a function of x. If we wish, we can now 
find the derivative of this function. 
$$\frac{dy}{dx} = \frac{1}{2 + \cos y}$$
If we were asked to find f′(2π) then, since we know f(2π) = π, we
could answer
$$f'(2\pi) = \frac{1}{2 + \cos\pi} = \frac{1}{2 - 1} = 1.$$





Implicit differentiation was not strictly necessary here, since we 
could have expressed x as a function of y, found dx/dy, and 
inverted this to get dy/dx. Our next example is one in which 
there is no option other than implicit differentiation. 
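
Although there is no formula for f, a computer can still evaluate it. Here is a minimal Python sketch that solves 2y + sin y = x by bisection (my choice of method, not something prescribed by the text) and then estimates the slope at x = 2π from two nearby points. Bisection works because the left-hand side is an increasing function of y, its derivative 2 + cos y never being less than 1.

```python
import math

def f(x, tol=1e-12):
    # Solve 2y + sin y = x for y. Since |sin y| <= 1, the solution lies
    # within 1/2 of x/2, so [x/2 - 1, x/2 + 1] safely brackets it.
    lo, hi = x / 2 - 1, x / 2 + 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if 2 * mid + math.sin(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def slope(x, h=1e-4):
    # Central-difference estimate of f'(x).
    return (f(x + h) - f(x - h)) / (2 * h)

print(f(2 * math.pi), slope(2 * math.pi))
```

The first number should be close to π and the second close to 1, in agreement with the result of the implicit differentiation.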


Example 10 
> Let x + cos x = y + e^y. The graph of this relation passes through
the origin. What is its slope there? Check your result numerically 
with small values of x and y. 
> We differentiate implicitly. 

$$dx - \sin x\,dx = dy + e^y\,dy$$

$$\frac{dy}{dx} = \frac{1 - \sin x}{1 + e^y}$$
Plugging in x = 0 and y = 0 gives dy/dx = 1/2. 
To check this result, we use the approximation (y − 0)/(x − 0) ≈
dy/dx, which should be valid for small values of x and y. Let’s
use x = 0.010 and y = 0.005, which are small and have y/x = 
1/2, as they approximately should according to the result of our 
implicit differentiation. If we didn’t make a mistake in our calculus, 
then these values of x and y should be nearly, but not exactly, 
solutions of the original equation that defined the relation between 
the variables. Plugging in, we have 


$$x + \cos x \stackrel{?}{=} y + e^y$$


$$1.00995 \approx 1.01001$$


These are indeed nearly equal, but in fact they were guaranteed 
to be nearly equal simply because (x, y) was close to the origin, 
and we knew that the origin was a point on the graph. What we 
need to check is that the discrepancy between the two sides is 
small compared to x and y themselves; if y = (1/2)x is the best 
linear approximation to the graph near the origin, then the error 
should be on the order of the squares of the variables, i.e., something
like 10⁻⁴. Subtracting, we find that the difference between
the two sides of the equation is about 6 × 10⁻⁵, which is indeed
small enough to confirm the result of the implicit differentiation. 
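
The check above can be pushed a little further on a computer. In this Python sketch (function names and the bisection method are my own choices), `residual` reproduces the discrepancy computed by hand, and `y_of_x` actually solves the relation for y near the origin, so that we can watch the ratio y/x approach 1/2 as x shrinks. Bisection is legitimate here because y + e^y is an increasing function of y.

```python
import math

def residual(x, y):
    # Difference between the two sides of x + cos x = y + e^y.
    return (y + math.exp(y)) - (x + math.cos(x))

def y_of_x(x, tol=1e-14):
    # Solve y + e^y = x + cos x for y by bisection on [-1, 1],
    # which brackets the solution for the small x values used here.
    lo, hi = -1.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if residual(x, mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(residual(0.010, 0.005))
for x in (0.1, 0.01, 0.001):
    print(x, y_of_x(x) / x)
```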
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Implicit differentiation applied to Watt's linkage Example 11 
As remarked on p. 161, there is a strong practical motivation for 
finding the slope of the curve 


$$(x^2 + y^2)^2 = 2(x^2 - y^2) \tag{12}$$


where it passes through the origin. Applying implicit differentia- 
tion, we have 


$$\begin{aligned}
2(x^2 + y^2)(2x\,dx + 2y\,dy) &= 2(2x\,dx - 2y\,dy) \\
(1 - x^2 - y^2)\,x\,dx &= (1 + x^2 + y^2)\,y\,dy \\
\frac{dy}{dx} &= \frac{(1 - x^2 - y^2)\,x}{(1 + x^2 + y^2)\,y}
\end{aligned}$$





Directly plugging in x = 0 and y = 0 doesn’t work, since this gives
0/0, which is an indeterminate form (ch. 6). For small values of x
and y, the squares x² and y² become negligible compared to 1,
and dy/dx ≈ y/x, so this becomes


$$\frac{y}{x} \approx \frac{x}{y}, \qquad \text{i.e.,} \qquad \frac{y}{x} \approx \pm 1.$$


Therefore this curve has a slope of ±1 on its two segments crossing
the origin. To make Watt’s linkage (with arms in the proportions
previously described) constrain its central point to nearly
vertical motion, we need to rotate it by 45 degrees.
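
A numerical sanity check is possible here because equation (12) happens to be a quadratic in u = y², which gives u = −(x² + 1) + √(4x² + 1) for the branch with real y. The Python sketch below uses that observation (my own rearrangement, not part of the original example) to find points on the curve near the origin and print the ratio y/x.

```python
import math

def y_on_curve(x):
    # Positive-y solution of (x^2 + y^2)^2 = 2(x^2 - y^2),
    # obtained by solving the quadratic in u = y^2.
    u = -(x**2 + 1) + math.sqrt(4 * x**2 + 1)
    return math.sqrt(u)

for x in (0.1, 0.01, 0.001):
    print(x, y_on_curve(x) / x)
```

The ratio approaches 1 as x gets smaller, consistent with a slope of +1 on one branch; the mirror-image branch with negative y has slope −1.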





m / Example 11. Watt's linkage, rotated 45 degrees, with the curve (x² + y²)² = 2(x² − y²).
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Problem a1. 







Problem a3. 
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Problems 


a1 Figure n/1 shows a thin stick being compressed between a
person’s hands. If the force is greater than a certain amount, the 
stick will start to bow. Figure n/2 is similar, but at the bottom 
the stick is constrained so that it can’t rotate; that is, its tangent is 
kept vertical. The stick is stronger in this situation, and more force 
is required before it will start to deform. The ratio of the two forces 
can be shown to be (x/π)², where x is the smallest positive solution
of the equation
$$\tan x = x.$$


Inspection of a graph of the tangent function shows that the value
of x is approximately 4.5. Use Newton’s method to improve this
approximation to six decimal places. √


a2 The British economist Robert Malthus (1766-1834) theorized
that the human population would tend to grow exponentially with 
time, whereas the production of resources such as food would grow 
only linearly, due to factors such as technological improvements. Un- 
der these assumptions, the population would then inevitably become 
too great to be fed, resulting in an event now known as a Malthusian 
catastrophe, such as famine or genocide. As an example, suppose 
that the production of food in a certain country increases so that 
at time t > 0, agriculture can feed a population 2 + t (in units of
millions of people), while the population (in the same units) equals
e^t. A Malthusian catastrophe will then occur at a time t determined
by
$$2 + t = e^t.$$


Use Newton’s method to determine t to two decimal places. √


a3 The cycloid, figure o, was introduced briefly in example 8,
p. 159. It is the shape traced out in space by a point on the rim of 
a rolling wheel (which in this problem we take to have radius 1). Its 
equation in Cartesian coordinates can be written as 


$$x = \cos^{-1}(1 - y) - \sqrt{y(2 - y)},$$


which can’t be solved for y in terms of x (in the sense defined in section
9.3). Use Newton’s method to find the value of y corresponding
to x = 1, expressing your answer to five decimal places. √


c1 A sugar cube dissolves in hot tea so that the edge of the
cube decreases at a rate r = dℓ/dt. (a) How fast is the volume V
of the cube changing when the edge has length ℓ? (b) Check that
your answer has units that make sense. (c) Evaluate your answer
numerically for ℓ = 5.0 mm and r = −0.3 mm/s (millimeters per
second). 
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c2 (a) A conical water tank with vertex down has height h, and 

radius a at the top. The water is being drained out at a rate of 

flow F = dV/dt. How fast is the depth d of the water decreasing, 

when d has a certain value? (b) Check that your answer to part a 

has units that make sense. (c) Evaluate your answer numerically 

for a = 12m, h = 30 m, d= 20 m, and dV/ dt = —1.4 x 10-? m3/s. 
Vv 


c3 The photo shows a common geological formation called talus. 
Erosion causes rock and sand to be washed down the gullies, where 
over geological time this debris piles up higher and higher against 
the vertical cliff. Suppose that the pile is in the shape of half a cone, 
and that its volume grows at a rate R = dV/dt. The cone’s slope a 
is fixed by the maximum steepness for which friction is capable of 
keeping a rock from sliding down. (a) Find the rate dh/ dt at which 
the height of the cone grows, in terms of R, a, and h. (b) Check 
that your answer to part a has units that make sense. (c) Check the 
dependence of your answer on the variable R. That means that you 
should determine physically whether increasing R should increase 





the result or decrease it, and then compare this to the mathematical
behavior of your equation. (d) Do the same for the variables a and
h. √

Huge talus cones on the coast of Svalbard, problem c3.

c4 During chemotherapy, the volume of a spherical tumor de- 


creases at a rate that is proportional to its surface area. Show that 
its radius decreases at a constant rate. 


In problems e1-e9, evaluate the differentials. 





el d(B°?) Solution, p. 239 
e2 d(2000BC) Solution, p. 239 
e3 d(sin k) Solution, p. 239 
e4 = d(pb+ 3) Solution, p. 239 
e5 d(e”) v 
e6 d(uck) v 
e7 d(e”y) v 
e8 d (4) v 
e9 d(r?) (differential of the area of a circle) v 
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In problems g1-g4, a function y(x) is defined explicitly. Find an 
implicit definition that does not involve taking roots. Then use this 
description to find the derivative dy/ dx in terms of x. 


gl y=v2r24+1 > Solution, p. 239 


g2 y=Vl-2 v 


g3 y= Vr4+2? Vv 


In each of the problems i1-i4, an implicit relation is defined between 
x and y, and the graph passes through the origin. Find the slope of
the graph at the origin. 





il re t¥ +y=0 > Solution, p. 240 
i2 sin x — ycos(xy) = 0 Vv 
i3 (2 + 2y — 1)? + (42 —y— 1)? =0 v 
i4 sin (ye) + e™°¥ —1=0 v 
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k1 An astroid is the shape traced by a point on a circle of radius 
a/4 as it rolls around the inside of a circle of radius a. Its equation 
is 
$$x^{2/3} + y^{2/3} = a^{2/3}.$$

(a) Check that the units of the equation make sense. (b) Use implicit
differentiation to find an extremely simple expression for dy/dx in
terms of y and x. (Do not eliminate y in favor of x, because that 
makes the expression more complicated.) (c) Check the units of 
your result. (d) Check that the sign of your result is correct in all 
four quadrants of the graph. (e) The notion of a cusp was briefly 
introduced on p. 61; it is a horn-shaped point on a graph where 
the two branches are parallel when they meet at the tip. From the 
figure, it’s hard to tell whether the astroid has cusps or whether 
there is a nonzero angle between the branches. Use your result to 
determine which is the case. 


k2 The figure shows a fountain in Sergel’s Square, Stockholm, 
named after the sculptor Sergel. The fountain was designed by 
architect David Helldén using a mathematical shape suggested by 
his friend, the Danish mathematician, poet, designer, and author 
Piet Hein. The equation of the shape is 


$$|x|^{5/2} + |y|^{5/2} = a^{5/2},$$


where a is a constant. (a) Find the units of a. (b) Use implicit
differentiation to find an extremely simple expression for dy/dx in
terms of y and x. For simplicity, you can restrict your result to
the first quadrant. (Do not eliminate y in favor of x, because that 
makes the expression more complicated.) (c) Check that the units 
of your result make sense. (d) Check that the sign of your result 
makes sense. (e) Check that the result makes sense where the curve 
intersects the positive x and y axes. 


k3 Evaluate d(x^y), and show that you can recover the correct
results in the special cases where x or y is constant. Hint: rewrite 
the expression in terms of the exponential function. 


z1 Newton’s method fails in some cases. As an example, suppose 
we have f(a) = |a|!/*, we want to find an x such that f(x) = 0, and 
we start with x₀ = 1 as our initial guess. Of course this is a silly
application, since it’s obvious that the solution is x = 0, but the 
point is to study a simple example where the method fails. Find a 
formula for |xₙ − xₙ₋₁| in this example. Then use this result in a
proof by induction to show that Newton’s method fails. 


An astroid, problem k1. 


Problem k2. 


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Chapter 8 
The integral 


8.1 The accumulation of change 


8.1.1 Change that accumulates in discrete steps 
A schoolboy plays a trick 


Toward the end of the eighteenth century, a German elementary 
school teacher decided to keep his pupils busy by assigning them a 
long, boring arithmetic problem: to add up all the numbers from 
one to a hundred.¹ The children set to work on their slates, and the
teacher lit his pipe, confident of a long break. But almost imme- 
diately, a boy named Carl Friedrich Gauss brought up his answer: 
5,050. 


Figure a suggests one way of solving this type of problem. The 
filled-in columns of the graph represent the numbers from 1 to 7, 
and adding them up means finding the area of the shaded region. 
Roughly half the square is shaded in, so if we want only an approximate
solution, we can simply calculate 7²/2 = 24.5.


But, as suggested in figure b, it’s not much more work to get 
an exact result. There are seven sawteeth sticking out above
the diagonal, with a total area of 7/2, so the total shaded area is
(7² + 7)/2 = 28. In general, the sum of the first n numbers will be
(n² + n)/2, which explains Gauss’s result: (100² + 100)/2 = 5,050.


There is a tantalizing hint here of a link with differential calculus, 
because the derivative of a real function f(n) = (n² + n)/2 is almost,
but not quite, equal to n. 
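
The formula is easy to test by brute force. Here is a short Python sketch (the function names are mine) that adds the numbers one at a time and compares the result with (n² + n)/2.

```python
def staircase_sum(n):
    # Add 1 + 2 + ... + n the slow way, as the teacher intended.
    return sum(range(1, n + 1))

def gauss_formula(n):
    # The shortcut, (n^2 + n)/2. The numerator is always even,
    # so integer division gives the exact answer.
    return (n * n + n) // 2

for n in (7, 100, 1000):
    print(n, staircase_sum(n), gauss_formula(n))
```

For n = 100 both methods give 5,050. As for the hint about derivatives, differentiating (n² + n)/2 gives n + 1/2 rather than n.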


Accumulation of change in discrete steps 


Problems like this come up frequently. Imagine that each house- 
hold in a certain small town sends a total of one ton of garbage to 
the dump every year. Over time, the garbage accumulates in the 
dump, taking up more and more space. If the population is constant, 
then garbage accumulates at a constant rate. But maybe the town’s 
population is growing. If the population starts out as 1 household 
in year 1, and then grows to 2 in year 2, and so on, then we have 
the same kind of problem that the young Gauss solved. After 100 
years, the accumulated amount of garbage will be 5,050 tons. The 





¹I’m giving my own retelling of a hoary legend. We don’t really know the
exact problem, just that it was supposed to have been something of this flavor. 








a/Adding the numbers from 
1 to 7. 





b / A trick for finding the sum. 




c/Carl Friedrich Gauss (1777- 
1855), a long time after gradu- 
ating from elementary school. 
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d/Bernhard Riemann 
(1826-1866).


pile of refuse grows more quickly every year. 


Sigma notation 


There is a convenient way of notating sums like the ones we’ve 
been doing, which involves Σ, called “sigma,” the capital Greek
letter “S.” Here the “S” stands for “sum.” The sigma notation 


looks like this: 
$$\sum_{i=1}^{100} i = 5{,}050 \tag{1}$$
This is read as “the sum of i for i from 1 to 100 equals 5,050.” The
version without the sigma notation is much more cumbersome to
write:
$$1 + 2 + 3 + \ldots + 100 = 5{,}050 \tag{2}$$


In equation (1), i is a dummy variable. We could have written


$$\sum_{j=1}^{100} j = 5{,}050$$


and it would have meant exactly the same thing. We’ve already 
seen some examples of dummy variables. In set notation (box 1.1, 


p. 15), 
$$S = \{x \mid x^2 > 0\} \quad \text{and} \quad T = \{y \mid y^2 > 0\}$$


describe exactly the same set, and S = T. Similarly, the function f
defined by f(u) = u² and the function g defined by g(v) = v² are
the same function, f = g. 


8.1.2 The area under a graph 


The examples in section 8.1.1 involved change that occurred in 
discrete steps. Calculus is concerned with continuous change. The 
continuous analog of a discrete sum is the area under a graph. Let 
f be a function that is defined on an interval² [a, b] and assume the
value of f is always positive (so that its graph lies above the x axis).
How large is the area of the region caught between the x axis, the
graph of y = f(x) and the vertical lines x = a and x = b?





e/1. The area under the graph of the function f. 2. Approximating 
this area using 20 thin rectangles. 





²For interval notation, see p. 15.
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8.1.3 Approximation using a Riemann sum 


We can try to compute this area, figure e/1, by approximating the
region with many thin rectangles, e/2. The idea is that even though
we don’t know how to compute the area of a region bounded by
arbitrary curves, we do know how to find the area of one or more
rectangles. In this example, we’ve subdivided the interval from a
to b into n = 20 equal subintervals, each of width Δx = (b − a)/n.
Let’s write x₁ for the x value that lies in the center of the first
subinterval, etc. We’ve chosen the height of each rectangle so that
its top intersects the graph at this midpoint, so that, e.g., the height
of the first rectangle is f(x₁). The area of the kth rectangle is the
product of its height and width, which is f(xₖ)Δx. Adding up all
the rectangles’ areas yields


$$R = \sum_{k=1}^{n} (\text{height})(\text{width}) = \sum_{k=1}^{n} f(x_k)\,\Delta x. \tag{3}$$


This is an example of what is called a Riemann sum, meaning an 
approximation to the area under a curve using rectangles. This 
particular type of Riemann sum is one in which (a) the interval is 
subdivided into equal parts, and (b) the value of the function is 
sampled at the center of each subinterval. 


If f is negative in certain places, then we will hit certain values of 
k for which the product f(xₖ)Δx is negative. We will simply define
areas below the x axis to be negative. We think of the rectangle
as having positive width Δx but negative height f(xₖ). A similar
geometrical example is the use of negative numbers for angles that 
are directed contrary to a standard direction of rotation. 


If our rectangles are all sufficiently narrow then we expect the 
total area of all the rectangles to be a good approximation of the 
area of the region under the graph. 
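
The recipe above translates almost word for word into code. The following Python sketch is one way to write it; the function name and the test functions are my own choices for illustration.

```python
def midpoint_riemann_sum(f, a, b, n):
    # Split [a, b] into n equal subintervals of width dx, sample f at the
    # center of each one, and add up (height)(width) for all rectangles.
    dx = (b - a) / n
    return sum(f(a + (k - 0.5) * dx) for k in range(1, n + 1)) * dx

# Narrower rectangles should give a better approximation.
for n in (5, 20, 2000):
    print(n, midpoint_riemann_sum(lambda x: x * x, 0.0, 1.0, n))

# A function that is negative contributes negative area.
print(midpoint_riemann_sum(lambda x: -1.0, 0.0, 2.0, 10))
```

For the function x² on [0, 1] the printed values settle down toward 1/3 as n grows, and the constant function −1 on [0, 2] gives −2.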


8.2 The definite integral 


8.2.1 Definition of the integral of a continuous function 
This suggests the following definition. 


Definition of the integral of a continuous function 
If f is a continuous function defined on an interval [a, b], then the
integral of f(x) from x = a to b is defined as


$$\lim_{\Delta x \to 0} R,$$


where R is the type of Riemann sum defined above, using equal 
subintervals sampled at their centers. 
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f/ Three Riemann sums for the same function on the same _inter- 
val. As Ax approaches zero, the total area approaches the Riemann 
integral. 


Finding the integral of a function is referred to as integrating it.
The idea behind the words is that one meaning of “integrate” in 
ordinary speech is to assemble a whole out of smaller parts. For 
example, you could integrate sit-ups into your routine at the gym. 


Up until now we’ve been doing differential calculus. The other 
half of calculus, integral calculus, consists of the study of integrals. 
The type of integral defined here is called a definite integral. We’ll 
see later that there is another type, called the indefinite integral. 


This definition is restricted to continuous functions. A more 
general definition is given in section 8.6.2, p. 192. 





g / Example 1. 





A triangle Example 1
Let f(x) = x. Then the integral of f from 0 to 1 represents the 
area of a triangle with height 1 and a base of width 1. We know 
from elementary geometry that this shape has an area equal to 
(1/2)(base)(height) = 1/2, so we don’t need integral calculus to determine
it. But let’s see how this works out if we do it as an integral,
in order to get comfortable with the tool and see if it works in a 
case where we already know the answer. 


When we split up the interval [0, 1] into n parts, we have Δx = 1/n.
The first subinterval is [0, Δx], and its center is the first sample
point, x₁ = (1/2)Δx. Continuing in this way, we have xₖ = (k −
1/2)Δx, for k running from 1 to n. Since our function is just f(x) =
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x, we also have f(x,) = (kK — 1/2)Ax. The Riemann sum R is 
shown in figure g. It looks almost exactly like the staircase in a 
on p. 173. There are two differences: (1) in the original staircase 
problem, the graph covered a region of graph paper n squares 
wide and n squares tall, whereas the graph of our Riemann sum 
is scaled down so that it fits inside a single square with a width of 
1 and a height of 1; (2) all of the steps have been lowered by half 
a step. 


When we evaluate the Riemann sum, we find that the fates have 
been kind to us, and its value in this example always seems to 
be 1/2, for every n. For example, with n = 3 the Riemann sum is 


$$\frac{1}{2}(\Delta x)^2 + \frac{3}{2}(\Delta x)^2 + \frac{5}{2}(\Delta x)^2 = \frac{9}{2}(\Delta x)^2 = \frac{1}{2}$$








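To see that this is always true in this example, let’s go ahead and
compute the Riemann sum for an arbitrary n.


$$R = \sum_{k=1}^{n} f(x_k)\,\Delta x = (\Delta x)^2 \sum_{k=1}^{n} \left(k - \frac{1}{2}\right) = (\Delta x)^2 \left\{ \left[\sum_{k=1}^{n} k\right] - \frac{n}{2} \right\}$$


The sum over k is the same one that we encountered in our previous
study of the “staircase” sum; it equals (n² + n)/2. The result is:


$$R = (\Delta x)^2 \left\{ \left[\frac{n^2 + n}{2}\right] - \frac{n}{2} \right\} = \frac{n^2}{2}(\Delta x)^2$$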





But Δx = 1/n, so R = 1/2 exactly for every n, and the integral
equals


$$\lim_{n \to \infty} R = \frac{1}{2},$$


as expected geometrically. 
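The claim that the Riemann sum for this triangle equals 1/2 for every n can also be checked by exact arithmetic. The sketch below is an illustration added here, with the hypothetical function name `triangle_riemann_sum`; it uses Python's `fractions` module so that no rounding error enters.

```python
from fractions import Fraction

def triangle_riemann_sum(n):
    """Exact midpoint Riemann sum of f(x) = x on [0, 1] with n equal subintervals."""
    dx = Fraction(1, n)
    # Sample points are x_k = (k - 1/2) * dx for k = 1..n, and f(x_k) = x_k.
    return sum((k - Fraction(1, 2)) * dx for k in range(1, n + 1)) * dx

for n in (1, 2, 3, 10, 1000):
    print(n, triangle_riemann_sum(n))
```

Every line of output shows the same fraction, in agreement with the algebra.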
8.2.2 Leibniz notation


If we take equation (3) that defines the Riemann sum R, and
substitute it into the definition of the integral, the limit of R as
Δx → 0, the result looks like this:


$$\lim_{\Delta x \to 0} \sum_{k=1}^{n} f(x_k)\,\Delta x$$


Leibniz invented the following expressive, versatile, and useful
notation for this limit:


$$\int_a^b f(x)\,dx$$


The symbol ∫ is an “S” that’s been stretched like taffy. It stands
for “sum,” just as the sigma, Σ, stands for “sum.” But we think of
∫ as meaning a smooth sum, whose graphical representation is the
area under a smooth curve rather than under a staircase. Notice
how the shape of ∫ is smooth. Like the k in the sigma notation,
the x in this example is a dummy variable. Therefore ∫_a^b f(x) dx
means exactly the same thing as ∫_a^b f(s) ds. The dummy variable
inside an integral is referred to as a variable of integration, and has
no meaning outside the integral. One of the reasons for writing the
dx is that it states what we’re integrating with respect to.


Leibniz notation for the area of a triangle Example 2
In example 1, we integrated the function f(x) = x from x = 0 to 1,
and found that it was 1/2. In Leibniz notation, the result is written
like this:


$$\int_0^1 x\,dx = \frac{1}{2}$$


It makes no difference if we notate this instead with s as the
variable of integration:


$$\int_0^1 s\,ds = \frac{1}{2}$$


A rectangle Example 3 


> Evaluate


$$\int_0^4 1\,dx.$$


> The graph of this function is a rectangle with height 1 and width 
4. A rectangle is a shape that can be sliced up into thin, vertical 
slices that are also rectangles, and this is what any Riemann-sum 
approximation to this integral will look like. The approximations 
aren't really approximations at all. Every Riemann sum has an 
area of 4, so the limit occurring in the definition of the integral is 4. 
This is of course the correct result for the area of this rectangle. 


We defined the Leibniz notation as simply a notation for a cer- 
tain limit, but we can think of it conceptually as a sum with infinitely 
many terms. That is, we make a Riemann sum with infinitely many 
rectangles. Normally if you added up an infinite number of things,
you would expect to get an infinite result. But remember, each of
these rectangles is infinitely skinny. We think of dx as being the
infinitely small width, so that the area f(x) dx is infinitely small.
We’re therefore adding an infinite number of things, each of which 
is infinitely small, so that the result can be finite. Recall that, as 
discussed in section 2.9, p. 64, the real number system doesn’t have 
infinitely big or infinitely small numbers; however, if we handle our 
infinities according to the simple rules given in that section, nothing 
bad happens. Historically, these rules weren’t formalized, and prac- 
titioners just knew that if they did their work according to certain 
methods, the Leibniz notation never led to the wrong result. This 
confusion was definitively cleared up around 1965, but many math- 
ematicians have been influenced by the historical uneasiness about 
the Leibniz notation, so they prefer to think of ∫ ... dx purely as a
shorthand notation for a limit. This is a matter of taste. Those who
prefer to think of it only as a shorthand will consider the dx inside
the integral to be nothing more than punctuation, like the period 
at the end of a sentence. From this point of view, its only job is to 
tell us what the dummy variable is, i.e., what we’re integrating with 
respect to. 


Moving the dx around Example 4 
One of the rules in section 2.9 was that we were allowed to manip- 
ulate differentials such as dx using any of the elementary axioms 
of the real numbers (section 1.6, p. 25). One of these axioms 
is that multiplication is commutative, uv = vu. Therefore the inte- 
gral in example 2 on p. 178 can be written in either of the following 
equivalent ways: 


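$$\int_0^1 x\,dx = \frac{1}{2} \qquad\qquad \int_0^1 dx\,x = \frac{1}{2}$$


Similarly, all of the following are the same integral:


$$\int_1^2 \frac{dx}{x} \qquad\qquad \int_1^2 \frac{1}{x}\,dx \qquad\qquad \int_1^2 dx\,\frac{1}{x}$$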


Most people would write it with the dx on top, which makes it more 
compact. 


The integral of ... what? Example 5 
How should we interpret this expression? 


$$\int_0^4 dx$$


There doesn’t seem to be any function written inside the integral,
so what is it that we’re integrating? One of the elementary axioms
of the real numbers (section 1.6, p. 25) is that 1 is the
multiplicative identity, i.e., 1u = u for any number u. As discussed in
section 2.9, the elementary axioms also apply to differentials. Therefore
it’s valid to rewrite our integral as follows.


$$\int_0^4 1\,dx$$


The function we're integrating is 1, which makes this the same 
integral as the one in example 3 on p. 178. The result is 4. 


Another way of interpreting the original form of the integral is that 
dx means “a little bit of x,” so that the integral expresses the idea 
of letting x change from 0 to 4, and adding up all the little changes 
in x. Clearly the sum of all the little changes will be the total 
change, which is 4. 


Another nice feature of the Leibniz notation is that it makes 
the units come out right. Consider our earlier example of the town 
dump. Suppose that the rate of garbage production is given by a 
function p(t), where t is in units of years and p in tons per year.
Then the amount of garbage accumulated at the town dump from
year a to year b is given by


$$\int_a^b p(t)\,dt.$$


The integral sign ∫ is a kind of sum, and the units of a sum are the
same as the units of each term. Since d means “a little bit of ...,”
dt stands for a little bit of time, and it therefore also has units of
years. The units of the terms in the sum are


$$\frac{\text{tons}}{\text{year}} \times \text{years} = \text{tons},$$


which makes sense. 


We can now see three independent reasons why an integral such
as ∫_0^1 x dx can’t be written like ∫_0^1 x, without the dx:


1. If x has units, then the expression without the dx has the
wrong units.


2. It would be a sum of infinitely many numbers, each of them 
finite, so it would probably be infinite. 


3. If we don’t write the dx, we haven’t stated what we’re inte- 
grating with respect to. 
8.3 The fundamental theorem of calculus


8.3.1 A connection between the derivative and the integral 


We’ve already seen some clear indications of a link between 
derivatives and integrals. A derivative is a rate of change, and an 
integral measures the accumulation of change. Let’s say for con- 
creteness that we’re talking about functions of time. If a function 
A tells us the rate at which function B changes, then B tells us 
how the rate of change measured by A has accumulated over time. 
That is, it seems clear conceptually that the integral and the deriva- 
tive are inverse operations: operations that undo each other, in the 
same way that subtraction undoes addition, or a square root undoes 
a square. 


Figure h shows this in the context of discrete rather than con- 
tinuous functions. Column A shows how many tons of garbage are 
sent to the town dump per year. It is the rate of change of the pile 
at the dump, which is given in column B. The population is grow- 
ing, so column A is not constant. Presumably one of these columns 
was typed into the spreadsheet from data collected by the town, but 
we can’t tell from looking at the spreadsheet which one it was. It’s 
possible that the raw data was column A, in which case column B 
would have been constructed by telling the spreadsheet software to 
calculate a running sum based on A. The running sum of a discrete 
function is conceptually similar to the integral of a continuous one, 
so we can say in some loose sense that B is the integral of A.
On the other hand, it’s possible that the raw data was column B: a 
municipal employee has been going out to the dump at yearly inter- 
vals and measuring how big the pile of trash was. Column A would 
then have been calculated from B by taking differences of successive 
years. This is conceptually similar to saying that A is the derivative 
of B. 
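The two spreadsheet operations described above, a running sum and differences of successive years, can be sketched in a few lines. This is an illustration added here, using the numbers from figure h; the variable names are hypothetical, and `itertools.accumulate` with `initial=` needs Python 3.8 or later.

```python
from itertools import accumulate

# Column A from figure h: tons of garbage sent to the dump each year.
column_a = [1, 2, 3, 4, 5, 6, 7]

# Running sum of A, starting from an empty dump, gives column B.
column_b = list(accumulate(column_a, initial=0))
print(column_b)

# Differences of successive years of B give column A back.
recovered_a = [later - earlier for earlier, later in zip(column_b, column_b[1:])]
print(recovered_a)
```

The running sum plays the role of the integral and the successive differences play the role of the derivative.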


8.3.2 What the fundamental theorem says 


The fundamental theorem of calculus 
Let f be a function defined on the interval [a, b], and let f be
differentiable on that interval. Then


$$\int_a^b f'(x)\,dx = f(b) - f(a). \qquad (4)$$

On the left-hand side, we have taken a function, differentiated 
it, and then integrated it. The right-hand side is a simple expression 
involving the original function, i.e., in some sense the integration has 
undone the differentiation, and we are left with the same function 
we started with. 


h / Columns A and B in the spreadsheet relate to each other
approximately as derivative and integral.

          A                   B
     1    garbage per year    accumulated garbage
     2                        0
     3    1                   1
     4    2                   3
     5    3                   6
     6    4                   10
     7    5                   15
     8    6                   21
     9    7                   28
    10


i / The initial amount of garbage is 1000 tons rather than zero.

          A                   B
     1    garbage per year    accumulated garbage
     2                        1000
     3    1                   1001
     4    2                   1003
     5    3                   1006
     6    4                   1010
     7    5                   1015
     8    6                   1021
     9    7                   1028
    10


j / After translation by a computer from English to Chinese,
and then back to English, the original sentence is not quite the
same. By analogy, the fundamental theorem tells us that if
we differentiate, then integrate, we cannot quite recover the
original function: we lose any information that amounts to an
over-all additive constant.

    Original:     Fred owns two cute terriers and an overweight cat.
    Round trip:   Fred has two cute little terrier and an overweight cat.


To see why the right-hand side contains a difference of two values
of f, consider figure i, which is a modified version of h. What’s
changed is that rather than starting out empty in the first year,
in this version of history the dump started out with 1000 tons of
garbage already in it. This alteration of column B, however, has no
effect on column A. For example, the subtraction 1015 − 1010 gives
the same result as 15 − 10. The fundamental theorem tells us that
we can make a “round trip” by computing column A from column B
using differences, and then reconstructing column B again by taking
a running sum. But the round trip isn’t perfect (cf. figure j). Some
information is lost, because given column A, we can’t tell whether
the version of column B we should reconstruct is the one in figure h,
the one in i, or some other version that differs from them by some
other additive constant. What we can tell is that the difference
between the initial and final cells of column B must have been 28,
which is the sum of column A.
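The loss of the additive constant in the round trip can be seen directly in a short sketch. This is an illustration added here using the numbers quoted in the text; the helper name `differences` is hypothetical.

```python
def differences(values):
    """Differences of successive entries, like computing column A from column B."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

column_b_h = [0, 1, 3, 6, 10, 15, 21, 28]     # figure h: the dump starts out empty
column_b_i = [1000 + b for b in column_b_h]   # figure i: the dump starts with 1000 tons

# Both versions of column B produce the same column A ...
column_a = differences(column_b_i)
print(column_a == differences(column_b_h))

# ... so column A fixes only the difference between the final and initial cells.
print(sum(column_a), column_b_i[-1] - column_b_i[0], column_b_h[-1] - column_b_h[0])
```

Given only `column_a`, there is no way to decide between the two starting values, which is the discrete version of the lost constant.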


In terms of continuous functions rather than discrete ones, adding 
a constant onto f doesn’t change the derivative df/dx. Therefore
the left-hand side of the fundamental theorem can never tell us the 
value of f but only the difference in values between x = a and x = b. 


8.3.3 A pseudo-proof 


We’ve seen examples before in which the Leibniz notation makes 
certain facts about calculus seem so obvious that they don’t seem 
to need any further proof. This happens, for example, if we rewrite 
the chain rule as dz/dx = (dz/dy)(dy/dx), which makes it seem
like a simple fact about algebra; but this is not quite a rigorous 
proof for the reasons explained in example 18, p. 66. It’s a “pseudo- 
proof,” but that’s not necessarily a bad thing. Pseudo-proofs can 
be good. The pseudo-proof helps us to understand why the result 
makes sense, and it can, if we wish, serve as the backbone of a more 
rigorous proof. 


We will give a real proof of the fundamental theorem in section 
8.6.3, p. 194, but let’s warm up with the pseudo-proof, which is 
pretty simple. We start with a statement of the result, 


$$\int_a^b \frac{df}{dx}\,dx \stackrel{?}{=} f(b) - f(a), \qquad (5)$$


with the question mark above the equals sign to show that this is
what we are hoping to prove. For the same reasons as in example
18 on p. 66, it is not quite valid to cancel the factors of dx, but we’ll
do it anyway because this is only meant to be a pseudo-proof.


$$\int_{f(a)}^{f(b)} df \stackrel{?}{=} f(b) - f(a) \qquad (6)$$


We can interpret the symbol df as “a little bit of f,” so that the 
left-hand side is the sum of many very small changes in f. The 
limits of integration are now stated in terms of the values of f, since 
f is now the variable of integration, not x. (It’s true, but not as 
obvious, that this is equally valid regardless of whether f is always 
increasing or always decreasing. If f goes up and then comes back
down, we could, for example, have f(a) = f(b), so that the upper 
and lower limits of integration were the same.) 


It’s clearly reasonable now to hope that we can make the left- 
hand side of equation (6) equal the right. The left-hand side says 
that we add up many small changes in the variable f. The right- 
hand side is simply the total accumulated change in f. To see this 
a little more explicitly, let’s insert a factor of 1 inside the integral, 
as in example 5, p. 179. 


$$\int_{f(a)}^{f(b)} 1 \cdot df = f(b) - f(a) \qquad (7)$$


As in that example, this integral represents the area of a rectangle. 
The rectangle has width f(b) — f(a) and height 1, so its area is 
f(b) — f(a), and the equation holds. 


This pseudo-proof is refined into a real proof in section 8.6.3, 
p. 194.


8.3.4 Using the fundamental theorem to integrate; the 
indefinite integral 


Avoiding the Riemann sum 


The fundamental theorem says this:


$$\int_a^b f'(x)\,dx = f(b) - f(a).$$


In some examples, this gives us a tricky way to evaluate an
integral exactly without having to muck around with Riemann sums.
Consider the integral


$$\int_0^1 x\,dx,$$


whose geometrical interpretation is the area of a triangle and whose
value we showed to be 1/2 using Riemann sums in example 1, p. 176.
The function we’re integrating is x, but what if we could find a
function f whose derivative was x?


$$f'(x) = x$$


The fundamental theorem would then immediately tell us the result 
of the integral. 


Antiderivatives 


k / All three functions are antiderivatives of the constant
function 1/7. Shifting the graph vertically doesn’t change its
derivative.


The function f is called an antiderivative of the function f’.
Although there are various tricks and methods for finding
antiderivatives, in general the only way to find them is to guess and check.
One way to approach this one is to think of x as x¹. We know that
when we differentiate a power, the power rule tells us to knock down
the exponent by one. That makes it reasonable to guess something
like x² as an antiderivative of x. Checking our guess, we find that
it was almost, but not quite, right:


$$f(x) = x^2 \implies f'(x) = 2x \qquad \text{[not quite what we wanted]}$$


We wanted the derivative to be x, but we got 2x. This is easily fixed 
by halving our guess: 


$$f(x) = \frac{1}{2}x^2 \implies f'(x) = x$$


The function (1/2)x² is an antiderivative of x. Therefore by the
fundamental theorem we have


$$\int_0^1 x\,dx = f(1) - f(0) = \frac{1}{2} - 0 = \frac{1}{2}.$$

This is the same result that we obtained earlier and with much more 
labor using Riemann sums. 


Because antiderivatives are so frequently used in order to eval- 
uate definite integrals, expressions of the form f(b) — f(a) are very 
common, and various abbreviations have been invented. We will 
abbreviate 


$$f(b) - f(a) = f(x)\Big|_{x=a}^{x=b} = f(x)\Big|_a^b.$$


Any time we have an antiderivative, we can produce other an- 
tiderivatives by adding a constant. For example, all of the following 
are antiderivatives of the constant function 1/7 with respect to x: 





Differentiating any one of these with respect to x gives 1/7. 
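A quick numerical check of this idea is sketched below. It is an illustration added here: the three constants 0, 3, and −2 are arbitrary choices made for the sketch, and the derivative is only estimated, using a central difference with a small step h.

```python
def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Three candidate antiderivatives of the constant function 1/7.
# The added constants (0, 3, -2) are arbitrary, chosen only for this sketch.
candidates = [lambda x, c=c: x / 7 + c for c in (0.0, 3.0, -2.0)]

for f in candidates:
    print([round(derivative(f, x), 6) for x in (-1.0, 0.0, 2.5)])
```

Each row of output is the same, because the added constant shifts the graph vertically without changing its slope.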


Leibniz notation for the indefinite integral 


An antiderivative is more commonly referred to as an indefinite 
integral — as opposed to the kind of integral we’ve been talking 
about up until now, which is called a definite integral. The Leibniz 
notation for an indefinite integral is an integral sign without any 
upper or lower limits of integration. For example, 


$$\int x\,dx = \frac{1}{2}x^2 + c,$$


where c is any constant. One way of understanding this notation is 
that both sides of this equals sign represent a certain solution set — 
the set of all functions whose derivative equals x. Similarly, when
we write


$$\sqrt{4} = \pm 2,$$


we could say that both sides of the equation represent the solution
set {−2, 2} of the equation x² = 4.


The following table summarizes the differences between definite
and indefinite integrals.


    definite integral                       indefinite integral
    ------------------------------------    ------------------------------------
    ∫_a^b f(x) dx is a number.              ∫ f(x) dx is a function of x.

    ∫_a^b f(x) dx is defined in terms of    By definition, ∫ f(x) dx is any
    Riemann sums and can be interpreted     function of x whose derivative
    as the area under the graph of          is f(x).
    y = f(x).

    The variable of integration is a        The variable of integration is
    dummy variable. For example,            not a dummy variable. For example,
    ∫_0^1 2x dx = 1 and ∫_0^1 2t dt = 1,    ∫ 2x dx = x² + c and
    so ∫_0^1 2x dx = ∫_0^1 2t dt.           ∫ 2t dt = t² + c are expressed in
                                            terms of different variables, so
                                            they are not the same.


Example 6
> Evaluate


$$\int x^6\,dx$$


> Differentiation of a power will reduce the exponent by one, so
we want something like x⁷. The derivative of x⁷ would be 7x⁶,
which is too big by a factor of 7, so we want x⁷/7. Including an
arbitrary constant of integration, we have


$$\int x^6\,dx = \frac{1}{7}x^7 + c.$$
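The guess-and-check step in this example follows a pattern that can be written once for any power. The following is a sketch of that standard rule, using n for the exponent (a symbol introduced here, not in the example itself); it can be verified by differentiating the right-hand side.

```latex
\int x^n \, dx = \frac{x^{n+1}}{n+1} + c \qquad (n \neq -1),
\qquad \text{since} \qquad
\frac{d}{dx}\left( \frac{x^{n+1}}{n+1} \right) = x^n .
```

With n = 6 this gives x⁷/7 + c. The rule cannot be applied with n = −1, because it would require dividing by zero.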


Integral of 1/x Example 7
> Evaluate the indefinite integral


$$\int \frac{dx}{x}$$


> As discussed in example 4, p. 179, this notation says that the
function being integrated is 1/x, or x⁻¹. Normally if we wanted
to find the antiderivative of x to some power, we would increase
the exponent by 1, as in example 6. But the derivative of x⁰ is
simply zero, so that doesn’t work here. We recall that the ladder of
powers is interrupted at this place, figure l. The indefinite integral
we want is


$$\ln x + c.$$


l / Differentiation moves us down the ladder of powers of x.
Integration climbs the ladder, as in example 6. Example 7 deals
with the break in the middle of the ladder.
Area under the graph of 1/x Example 8
> Interpret the definite integral


$$\int_1^2 \frac{dx}{x}$$


graphically; then evaluate it.
> Figure m shows the graphical interpretation. 


We saw in example 7 that the integral of 1/x was ln x + c. Using
the fundamental theorem of calculus, the area is
(ln 2 + c) − (ln 1 + c) = ln 2 ≈ 0.693147180559945. Note that the
constant of integration cancels out when we plug in the upper and lower limits
of integration and subtract; this always happens when we evalu- 
ate a definite integral in this way, so constants of integration are 
irrelevant in this context, and usually we would skip writing the +c. 


Judging from the graph, it looks plausible that the shaded area is 
about 0.7. 
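As a rough numerical cross-check of this example, the sketch below compares a midpoint Riemann sum for 1/x on [1, 2] with ln 2 − ln 1 from the fundamental theorem. It is an illustration added here; the function name and the choice of 1000 subintervals are assumptions of the sketch.

```python
import math

def midpoint_riemann_sum(f, a, b, n):
    """Riemann sum of f on [a, b]: n equal subintervals, sampled at their centers."""
    dx = (b - a) / n
    return sum(f(a + (k - 0.5) * dx) for k in range(1, n + 1)) * dx

approx = midpoint_riemann_sum(lambda x: 1.0 / x, 1.0, 2.0, 1000)
exact = math.log(2.0) - math.log(1.0)   # fundamental theorem; the +c has canceled
print(approx, exact)
```

The two printed numbers agree to several decimal places, and both are close to the visual estimate of 0.7.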




8.4 Using the tool correctly 


8.4.1 When do you need an integral? 


In section 1.5.2, p. 23, we asked the question, “When do you 
need a derivative?” It’s natural to ask the same question about in- 
tegrals. And since the derivative and integral are so closely linked 
by the fundamental theorem of calculus, the answers should be re- 
lated. If the relationship between two variables A and B is such that 
expressing A in terms of B requires a derivative, then expressing B 
in terms of A also requires calculus: it requires an integral.








m / Example 8.


As a concrete example, let x be your car’s odometer reading,
and let v be the reading on the speedometer. If v is constant, then
we don’t need calculus to express it in terms of x.


$$v = \frac{\Delta x}{\Delta t} \qquad \text{[only if } v \text{ is constant]} \qquad (8)$$


But if v is changing, then equation (8) gives the wrong answer. We
need calculus.


$$v = \frac{dx}{dt} \qquad \text{[always valid]} \qquad (9)$$


Now suppose we want x in terms of v. If v is constant, then we 
don’t need calculus. Simple algebraic manipulation of equation (8) 
gives 

$$\Delta x = v\,\Delta t. \qquad \text{[only if } v \text{ is constant]} \qquad (10)$$


But equation (10) clearly doesn’t make sense if v isn’t constant. If
you’re in stop-and-go traffic, then your velocity isn’t just one
number. What would it even mean, then, to “multiply v by Δt”?
Multiplication is like that special thing that happens when a mommy
and a daddy love each other very much; it’s something that
happens between just one number and one other number. Applying the
fundamental theorem of calculus to equation (9), we get


$$\Delta x = \int_{t_1}^{t_2} v\,dt. \qquad \text{[always valid]} \qquad (11)$$


We expect the integral to come up in applications as a generalization 
of multiplication that covers the case where one of the factors is 
varying. 








n/ Example 9. The tractor does mechanical work. 


Work Example 9 
> In each of the examples in figure n, the tractor exerts a force
while traveling from position x_1 to position x_2, a distance
Δx = x_2 − x_1. If the force F is constant, then the quantity W = FΔx, called
mechanical work, measures the amount of energy expended. If
W is the same in all three cases in the figure, then the amount of 
gas the tractor burns is identical in all three cases. How should 
this definition of mechanical work be generalized to the case where 

the force is varying? 


> To generalize multiplication to a case where one of the factors 
isn’t constant, we use an integral. 


$$W = \int_{x_1}^{x_2} F\,dx$$
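To see how this integral reduces to ordinary multiplication when the force is constant, and what it does when the force varies, here is a minimal numerical sketch. It is an illustration added here: the function name `work` and both force laws (100 N constant, and 100 N plus 20 N per meter) are hypothetical values, not taken from the example.

```python
def work(force, x1, x2, n=1000):
    """Approximate W as a midpoint Riemann sum of force(x) from x1 to x2."""
    dx = (x2 - x1) / n
    return sum(force(x1 + (k - 0.5) * dx) for k in range(1, n + 1)) * dx

def constant_force(x):
    return 100.0               # hypothetical: 100 N everywhere

def varying_force(x):
    return 100.0 + 20.0 * x    # hypothetical: force grows with position

# With a constant force the integral is just F times the distance.
print(work(constant_force, 0.0, 5.0), 100.0 * 5.0)

# With a varying force there is no single F to multiply by.
print(work(varying_force, 0.0, 5.0))
```

The first line prints two matching values, showing that the integral contains W = FΔx as a special case.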


8.4.2 Two trivial hangups 


In section 1.4, p. 21, we discussed two common difficulties that 
students encounter in applying differentiation to real-world prob- 
lems. The same two issues occur in integration. The first is that 
although calculus textbooks will often notate every problem in
terms of the letters y and x, any letters of the alphabet can occur 
in real-life applications. The second is that one often encounters 
symbolic constants, which are to be treated just like numerical con- 
stants. 
A falling rock Example 10
> A falling rock has a velocity that increases linearly as a function
of time, v = at, where a is a constant. Use an indefinite integral
to find the position as a function of time.


> Let’s first figure out the roles played by the three letters:


• t: the independent variable

• v: a function of t

• a: a constant

• x: the function we get as an indefinite integral

Next, let’s warm up by translating this into a more stereotypical
problem from a calculus textbook. For example, we could be
given the function y = 7x and asked to find its indefinite integral.
The integral is ∫ y dx = (7/2)x² + c.


The solution to the actual problem is found by simply shuffling
letters of the alphabet and treating the constant a the same way
we treated the constant 7. The setup of the integral is


$$x = \int v\,dt$$


and the result is


$$x = \frac{1}{2}at^2 + c.$$


The constant of integration is interpreted as the initial position, so
it’s actually nicer to give it a notation that indicates that:


$$x = \frac{1}{2}at^2 + x_0$$

8.4.3 Two ways of checking an integral 

Every indefinite integral can be checked by taking its derivative 
to see if we can get back the original function. Furthermore, we can 
often check an integral by checking its units. 


Checking the falling rock Example 11 
Let’s use these techniques to check the result of example 10. We 
were given the function 


$$v = at. \qquad (12)$$


We set up the integral as


$$x = \int v\,dt, \qquad (13)$$


and the result was


$$x = \frac{1}{2}at^2 + x_0. \qquad (14)$$


First we take the derivative of both sides of equation (14).
Because t is the independent variable here, these are derivatives
with respect to t.


$$\frac{dx}{dt} = \frac{d}{dt}\left(\frac{1}{2}at^2 + x_0\right) \qquad (15)$$


The left-hand side is the definition of the velocity v. On the
right-hand side, we have to differentiate a polynomial. The constant
a is treated like any other multiplicative constant: it just “comes
along for the ride” in differentiation. The constant x_0 is treated like
any other additive constant in differentiation: it goes away.


$$v = a\,\frac{d}{dt}\left(\frac{1}{2}t^2\right) \qquad (16)$$


The derivative of (1/2)t² with respect to t is t, so we recover
equation (12), and our solution passes the check.


Next we check the units. The units of the given equation (12) 
ought to be right. If we remember the units of acceleration, we 
can check its units. If we don’t remember the units of acceleration, 
we need to infer the units of the symbolic constant a from equation 
(12), because otherwise we won't be able to do the check on our 
own work. Based on equation (12), the units of acceleration are 
implied to be meters over seconds squared, m/s².


Our initial setup in equation (13) has the following units: 


$$\underbrace{x}_{\text{m}} = \int \underbrace{v}_{\text{m/s}}\;\underbrace{dt}_{\text{s}}$$


The integral can be thought of as a sum, and the units of a sum 
are the same as the units of the things being added. This works 
out properly, so our setup passes this check as well. 


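We finish by checking the units of our final result, equation (14).


$$\underbrace{x}_{\text{m}} = \frac{1}{2}\,\underbrace{a}_{\text{m/s}^2}\,\underbrace{t^2}_{\text{s}^2} + \underbrace{x_0}_{\text{m}}$$


Both terms on the right have units of meters, matching the left-hand
side, so the result passes this check too.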


8.4.4 Do I differentiate this, or do I integrate it?


In an end-of-chapter problem in a calculus textbook, you’re usu- 
ally commanded either to integrate or to differentiate. In real-world 
contexts, however, the question can arise of which one is the right 
thing to do. Often we have a pair of variables, and we know that one 
is the integral of the other, and one is the derivative of the other. 
But which one is which? Memorization would be the wrong way 
to approach this. The following is a list of possible ways of telling 
which is which.


1. A derivative often represents a rate of change, an integral the
accumulation of change. 


2. Real-world quantities usually have units, and only one way of 
setting up the calculus relationship causes the units to make 
sense. 


3. The integral often occurs as a generalization of multiplication, 
the derivative as a generalization of the slope of a line. 


A chemical reaction Example 12 
> Chemicals P and Q react to produce R. There is a reaction 
rate r and a concentration C of the product. Which would be the 
derivative of which, and which would be the integral of which? 


> A derivative represents a rate of change, so r = dC/dt. An 
integral represents the accumulation of change, so C = ∫ r dt.


An epidemic Example 13 
> During an epidemic, there is some number of people I who have
the disease, and some number w of new cases per day being
reported. How would the calculus relationships between these
two variables be set up?


> The variable I is unitless; it is just a count of the number of
infected people. The variable w has units of cases per day, but
“cases” is really a count, not a unit, so the units of w are really
day⁻¹ (inverse days). Conceptually, it’s clear that these two
quantities should be related as integral and derivative, and if we were
unsure of which way around to write the relationship, the units
would tell us.


$$\underbrace{w}_{\text{day}^{-1}} = \frac{\overbrace{dI}^{\text{unitless}}}{\underbrace{dt}_{\text{days}}}$$


$$\underbrace{I}_{\text{unitless}} = \int \underbrace{w}_{\text{day}^{-1}}\;\underbrace{dt}_{\text{days}}$$


An example of the third method was given in example 9, p. 187, 
where the definition of mechanical work was generalized to cases 
where the force varies. 


8.5 Linearity 


The most important and basic properties of the derivative (p. 16) 
are that it adds, (f + g)’ = f’ + g’, and scales vertically, (cf)’ = cf’,
where c is a constant. When an operation has these properties, we
say that it is linear. Since the indefinite integral is defined as the
antiderivative, it follows that the indefinite integral is also linear,


$$\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx$$


$$\int c f(x)\,dx = c \int f(x)\,dx$$


and by the fundamental theorem the same is true for the definite 
integral. 





Example 14 


> Evaluate the definite integral 


$$\int_0^1 (1 + x)\,dx$$


and give a geometrical interpretation.


> The linearity of the definite integral gives


$$\int_0^1 (1 + x)\,dx = \int_0^1 1\,dx + \int_0^1 x\,dx = 1 + \frac{1}{2} = \frac{3}{2}.$$


Figure o gives a geometrical interpretation.


o / Example 14. The total area is the area of the square
base plus the area of the triangle on top.
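The splitting used in example 14 can also be checked numerically. The sketch below is an illustration added here, with hypothetical names; it evaluates midpoint Riemann sums for 1 + x, for 1, and for x on [0, 1] and compares the whole with the sum of the parts.

```python
def midpoint_riemann_sum(f, a, b, n=1000):
    """Riemann sum of f on [a, b]: n equal subintervals, sampled at their centers."""
    dx = (b - a) / n
    return sum(f(a + (k - 0.5) * dx) for k in range(1, n + 1)) * dx

whole = midpoint_riemann_sum(lambda x: 1.0 + x, 0.0, 1.0)
square = midpoint_riemann_sum(lambda x: 1.0, 0.0, 1.0)    # the square base
triangle = midpoint_riemann_sum(lambda x: x, 0.0, 1.0)    # the triangle on top

print(whole, square + triangle)
```

Linearity already holds for each Riemann sum, term by term, which is one way to see why it survives in the limit.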
8.6 Some technical points


8.6.1 Riemann sums in general


As a tree grows, its radius increases continuously. When a tree 
is cut down, as in figure p, we can see that the growth in each 
year is not the same. For example, in most of California, where the 
weather tends to be dry, a tree will usually show markedly increased 
growth in a wet year. In this example, it’s natural to think of the 
radius of the tree as an integral of the form ∫ dr. Of course it
would be silly to try to explicitly calculate this integral, when we 
could simply measure the radius with a ruler! We don’t really need 
calculus here, but, as is often the case, calculus guides us in thinking 
about the concepts even when we aren’t going to use the techniques 
of calculus. If we were to approximate this integral using a Riemann 
sum, it would seem most natural to break the sum down into unequal 
intervals Δr. This is allowed by the definition of a Riemann sum,
and the kind of Riemann sum that we defined on p. 175, with equal 
subintervals, was a more specific type. 





p / Each tree ring adds Δr to the radius of the tree. The Δr values
are not all the same.


A Riemann sum can also sample the value of the function at 
some other place than the center of each subinterval. The sample 
point can be at the left side, at the right, and it doesn’t even need to 
be chosen in a consistent way for all the subintervals of a particular 
Riemann sum. 
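A Riemann sum of this more general kind can be sketched as follows. This is an illustration added here, not a definition from the text: the function name `riemann_sum`, the partition, and the sample points are all arbitrary choices. The subintervals have unequal widths, and the sample point is taken at the left end of one subinterval, the right end of another, and in the interior of the rest.

```python
def riemann_sum(f, partition, sample_points):
    """Riemann sum over a partition p[0] < p[1] < ... < p[n].

    sample_points[k] must lie in the k-th subinterval [p[k], p[k+1]].
    """
    total = 0.0
    for left, right, x in zip(partition, partition[1:], sample_points):
        if not left <= x <= right:
            raise ValueError("sample point lies outside its subinterval")
        total += f(x) * (right - left)
    return total

partition = [0.0, 0.1, 0.4, 0.5, 1.0]   # unequal widths, chosen arbitrarily
samples = [0.0, 0.25, 0.5, 0.9]         # left end, interior, right end, interior

print(riemann_sum(lambda x: x, partition, samples))
```

With so few subintervals the result is only a crude estimate of the area under f(x) = x; refining the partition brings such sums closer to the integral.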


8.6.2 Integrating discontinuous functions 


The definition of the integral given in section 8.2.1, for contin- 
uous functions, has some technical shortcomings if we try to apply 
it to badly behaved discontinuous functions. Most people who use 
calculus neither know nor care about these issues, and it’s all right 
to skip this subsection on a first reading. 


To show what can go wrong, we define two functions, one naughty 
and the other even naughtier. 


• Let f(x) be defined as f(x) = 1/x, except at x = 0, where we
set f(0) = 0.


• Let g(x) be the function such that if x is a rational number,
g(x) = 0, but if x is irrational, g(x) = 1.


The definition of the integral in section 8.2.1 involved Riemann 
sums using equal subintervals, sampled at their centers. It carried 
a warning label saying that it only applied to continuous functions. 
Let’s ignore the warning and see what goes wrong when we apply it 
to functions f and g. 


The function $f$ is discontinuous at only one point, and the discontinuity
is one where it blows up to $+\infty$ on one side and $-\infty$
on the other. If we evaluate $\int_{-1}^{1} f(x)\,dx$ using equal subintervals
sampled at their centers, then because $f$ is odd, every Riemann sum
is exactly zero. The Riemann sums for odd $n$ use $x = 0$ as a sample
point, but these sums still vanish, because $f(0) = 0$. This integral,
as defined in section 8.2.1, comes out to be zero.


The function g is what’s known as a “pathological” example, 
meaning that it’s so weird that we don’t expect to encounter such 
a thing in any real-world application. For example, we could never 
determine a function like g from physical measurements, because 
measurements can’t distinguish a rational number from an irrational 
one. If we evaluate $\int_0^1 g(x)\,dx$ using equal subintervals, sampled at
their centers, then every sample point is a rational number, so the 
integral comes out to be zero according to the definition in section 
8.2.1. 
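
The cancellation for f can be checked exactly on a computer. The following is a sketch in Python using exact rational arithmetic; I am assuming the symmetric interval from -1 to 1, and the helper name `equal_interval_sum` and its `position` parameter are hypothetical. It also shows that moving the sample points away from the centers destroys the cancellation.

```python
from fractions import Fraction


def f(x):
    """1/x, patched with f(0) = 0 as in the text."""
    return Fraction(0) if x == 0 else 1 / x


def equal_interval_sum(func, a, b, n, position=Fraction(1, 2)):
    """Riemann sum on [a, b] with n equal subintervals.

    position = 1/2 samples each subinterval at its center; any other
    fraction between 0 and 1 samples that far across from the left end.
    """
    a, b = Fraction(a), Fraction(b)
    dx = (b - a) / n
    return sum(func(a + (i + position) * dx) for i in range(n)) * dx


# Centered samples cancel in pairs, so these sums are exactly zero.
centered = [equal_interval_sum(f, -1, 1, n) for n in (10, 11, 100, 101)]

# Samples a quarter of the way across each subinterval do not cancel.
off_center = [float(equal_interval_sum(f, -1, 1, n, Fraction(1, 4)))
              for n in (10, 100, 200)]
```

For the even values of n tried here, the off-center sums settle near 3.14 rather than 0, so the answer depends on how we sample. That is a sign that the integral should not be assigned a value at all.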


The worrisome thing about both of these examples is that they 
both gave zero, but zero is either misleading or wrong in both cases. 
The result for the integral of f depended on a perfect cancellation 
of very large negative and very large positive terms in each Rie- 
mann sum. As n grew, these terms grew without bound, but they 
still canceled. In any real-world application, it’s unlikely that this 
would happen. For example, if f represented the reading on a meter 
measuring the flow of water through a pipe (positive and negative 
indicating two different directions of flow), then the extreme positive 
and negative flows near x = 0 would have destroyed the meter! 


The zero result for g is even more morally wrong. There are in 
some sense more irrational numbers than rational ones, so if this 
integral were to have some value, then clearly it should be 1, not 0. 


What we would really like is to have our definition of the integral 
be stated in such a way that integrals like these come out to be 
undefined. This can be done by requiring in the definition that 
no matter what Riemann sum we use, regardless of whether the 
subintervals are equal or the sample points are at their centers, the 
limit must come out to be the same. 

Definition of the integral (Riemann)
Suppose we have a number $I$ such that for every $\epsilon > 0$, there exists
a $\delta > 0$ such that $|R - I| < \epsilon$ for every Riemann sum $R$ all of whose
intervals have width $x_{i+1} - x_i < \delta$, with any choice of sample points
$s_1, \ldots, s_n$. Then $I$ is the Riemann integral of the function.


For the integrals of the functions f and g described above, there 
is no number $I$ with the properties described in the definition. The
integral is then undefined, as it should be. A function for which 
such an I does exist is called Riemann integrable. A sufficient con- 
dition for Riemann integrability is that the function has only finitely 
many points of discontinuity, and it doesn’t blow up at these discon- 
tinuities. For functions that are Riemann-integrable, the Riemann 
integral gives the same answer as the simpler definition in section 
8.2, p. 175. 


8.6.3 Proof of the fundamental theorem 


We now refine the pseudo-proof in section 8.3.3, p. 182, into a 
real proof of the fundamental theorem of calculus. We want to prove 
that 


$$\int_a^b f'(x)\,dx = f(b) - f(a). \qquad (17)$$


We assume that f’ is Riemann integrable, so that we have the free- 
dom to subdivide the interval [a,b] and choose the sample points in 
any way that is convenient. We will break up the interval [a, b] into 
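n equal subintervals $[x_i, x_{i+1}]$, where $i = 1, 2, \ldots, n$. However,
rather than restricting ourselves to sampling at the center of each
subinterval, we apply the mean value theorem to each subinterval,
and choose $s_i$ to be the point for which

$$f'(s_i) = \frac{\Delta f_i}{\Delta x},$$

where $\Delta f_i = f(x_{i+1}) - f(x_i)$ and $\Delta x = x_{i+1} - x_i$. This can be
rearranged to give

$$\Delta f_i = f'(s_i)\Delta x.$$

Adding these up, we have

$$f(b) - f(a) = \sum_{i=1}^{n} \Delta f_i = \sum_{i=1}^{n} f'(s_i)\Delta x.$$
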
This tells us that by an appropriate choice of the sample points,
we can make every Riemann sum, for every $n$, produce the result
claimed by the fundamental theorem. It therefore follows that
the limit that defines the integral has the value claimed by the
theorem.
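
The key step of the proof can be watched in action. The sketch below, in Python with exact fractions, uses the example $f(x) = x^2$, which is my own choice because its mean value theorem point is easy to write down. The function names are hypothetical.

```python
from fractions import Fraction


def f(x):
    return x * x      # the function whose change we want


def f_prime(x):
    return 2 * x      # its derivative, which plays the role of the integrand


def mvt_riemann_sum(a, b, n):
    """Riemann sum of f' on [a, b] using mean-value-theorem sample points.

    For f(x) = x^2 the mean value theorem point of each subinterval is its
    midpoint, because (right^2 - left^2)/(right - left) = left + right,
    which is f' evaluated at the midpoint.
    """
    a, b = Fraction(a), Fraction(b)
    dx = (b - a) / n
    total = Fraction(0)
    for i in range(n):
        left = a + i * dx
        s = left + dx / 2          # the point where f'(s) = (change in f)/dx
        total += f_prime(s) * dx   # this term equals f(left + dx) - f(left)
    return total
```

Calling `mvt_riemann_sum(1, 3, n)` for any positive whole number n gives f(3) - f(1) with no error at all, which is the sense in which every one of these Riemann sums already has the value claimed by the theorem.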

8.7 The definite integral as a function of its integration bounds


8.7.1 A function defined by an integral


Consider the expression

$$I = \int_0^x t^2\,dt.$$

What does $I$ depend on? To find out, we calculate the integral

$$I = \left[\tfrac{1}{3}t^3\right]_0^x = \tfrac{1}{3}x^3 - \tfrac{1}{3}0^3 = \tfrac{1}{3}x^3.$$

So the integral depends on $x$. It does not depend on $t$, since $t$ is a
“dummy variable.”


In this way we can use integrals to define new functions. For
instance, we could define

$$I(x) = \int_0^x t^2\,dt,$$

which would be a roundabout way of defining the function $I(x) = x^3/3$.
Again, since $t$ is a dummy variable we can replace it by any
other variable we like. Thus

$$I(x) = \int_0^x \alpha^2\,d\alpha$$

defines the same function (namely, $I(x) = \tfrac{1}{3}x^3$).


This example does not really define a new function, in the sense
that we already had a much simpler way of defining the same function,
by writing “$I(x) = x^3/3$.” An example of a new function
defined by an integral is the so-called error function from statistics:

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt, \qquad (18)$$

so that erf(x) is the area of the shaded region in figure q.


q/The definition of the error function, erf(x). [Figure: graph of $y = \frac{2}{\sqrt{\pi}} e^{-x^2}$, with the shaded area equal to erf(x).]


The integral in (18) cannot be computed as a formula.³ As
described in more detail in section 10.1.2, p. 216, the integral in
(18) occurs very often in statistics, so it has been given its own
name, “erf(x)”.
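
Because erf is defined by an integral, we can approximate it with a Riemann sum and compare against a library routine. Here is a minimal sketch in Python; `erf_by_riemann_sum` is a name I made up, while `math.erf` is the implementation that ships with Python's standard library.

```python
import math


def erf_by_riemann_sum(x, n=1000):
    """Approximate erf(x) = (2/sqrt(pi)) * integral of exp(-t^2) from 0 to x."""
    dt = x / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt             # center of the i-th subinterval
        total += math.exp(-t * t) * dt
    return 2.0 / math.sqrt(math.pi) * total


approx = erf_by_riemann_sum(1.0)
library = math.erf(1.0)                # Python's built-in implementation
```

The two numbers should agree to several decimal places, which is a reasonable sanity check that the definition and the library function describe the same thing.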


8.7.2 How do you differentiate a function defined by an 
integral? 


The answer is simple, for if $f(x) = F'(x)$ then the fundamental
theorem says that

$$\int_a^x f(t)\,dt = F(x) - F(a),$$

and therefore

$$\frac{d}{dx}\int_a^x f(t)\,dt = \frac{d}{dx}\left(F(x) - F(a)\right) = F'(x) = f(x),$$

i.e.,

$$\frac{d}{dx}\int_a^x f(t)\,dt = f(x).$$

A similar calculation gives

$$\frac{d}{dx}\int_x^b f(t)\,dt = -f(x).$$


³For more on what this means, see section 9.3, p. 209.



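So what is the derivative of the error function? It is

$$\operatorname{erf}'(x) = \frac{d}{dx}\left[\frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt\right] = \frac{2}{\sqrt{\pi}}\,\frac{d}{dx}\int_0^x e^{-t^2}\,dt = \frac{2}{\sqrt{\pi}}\,e^{-x^2}.$$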
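
This derivative is easy to test numerically. The sketch below compares the formula $(2/\sqrt{\pi})e^{-x^2}$ with a finite-difference slope of Python's `math.erf`. The two function names and the step size `h` are my own choices.

```python
import math


def erf_derivative_formula(x):
    """The result of differentiating the integral: (2/sqrt(pi)) * exp(-x^2)."""
    return 2.0 / math.sqrt(math.pi) * math.exp(-x * x)


def erf_derivative_numeric(x, h=1e-5):
    """Central-difference estimate of the slope of math.erf at x."""
    return (math.erf(x + h) - math.erf(x - h)) / (2.0 * h)
```

Evaluating both at a few values of x, positive and negative, should give matching numbers up to the small error inherent in the finite difference.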


8.7.3 A second version of the fundamental theorem


The way that we differentiated the erf function in section 8.7.2 
was an example of a more general idea, which can be considered as 
an alternative version of the fundamental theorem of calculus. The 
version of the fundamental theorem of calculus given in section 8.3, 
p. 181, says that if we differentiate and then integrate, we end up 
with the same function back again. This new second version says
that something similar happens if we integrate and then differentiate:

$$\frac{d}{dx}\int_a^x f(t)\,dt = f(x).$$

Problems


Problems a1-a3 don’t require you to calculate anything. The point
is to practice setting up and interpreting relationships between pairs 
of variables that are related as integral and derivative. 


a1 A barometric altimeter is a device that uses a measurement
of air pressure $P$ to determine altitude $y$. Let the density of air be
$\rho$ (Greek letter “rho,” the equivalent of Latin “r”), and the strength
of the earth’s gravity $g$. If $\rho$ is constant, then the difference in
pressure between two heights is given by

$$P_2 - P_1 = \rho g \Delta y.$$

Mountaineers and airplane pilots often traverse enough height that
it is not a good approximation to take $\rho$ as being constant; the air is
less dense higher up. Use one of the methods of section 8.4.4, p. 189,
to generalize the equation appropriately. > Solution, p. 240


a2 Suppose that a business investment today will yield a stream
of income in the future $f(t)$, in units of dollars per year. The revenue
starts today, at $t = 0$, and will end in the future at $t = T$. The
value of a dollar promised in the future is less than a dollar in hand
today, because today’s dollar could be put in the bank and draw
interest, growing in value exponentially as $e^{rt}$, where $r$ is a constant
that is proportional to the interest rate. Consider the following two
proposed expressions for the present value $V$ of the revenue stream,
i.e., the amount that one should rationally be willing to pay today
in order to receive it.

$$V = \frac{d}{dt}\left(e^{-rt} f(t)\right)$$

$$V = \int_0^T f(t)e^{-rt}\,dt$$

As described in section 8.4.4, p. 189, determine which of these is
nonsense based on the units. > Solution, p. 240


a3 An electric meter installed outside your household measures
the flow of electric current $I$. If you turn on a lamp, $I$ increases, and
if you turn it back off again, $I$ goes back down. The cost $C$ of the
electricity is also a function of time; it grows until it’s time for the 
electric company to bill you. Consider the following two proposed 
relations between these variables. 


Here k is a constant. Use one of the methods of section 8.4.4, p. 189, 
to determine which of these makes sense. > Solution, p. 241 


Problems c3-c6. 


cl (a) Compute S7}_, z- (b) Compute oe v 


m=1m* 
c2 (a) Which of the following are correct ways of notating the
area of a right triangle with both legs of length 1?

$$\int_0^1 x \qquad \int_0^1 x\,dx \qquad \int_0^1 u\,du$$

(b) The function $f$ is defined by $f(x) = x^2 + 1$. Why is it wrong to
notate the antiderivative of $f$ as $\int x^2 + 1\,dx$? > Solution, p. 241


In each of problems c3-c6, the goal is to approximate the area between
the graph and the x axis between $x = 0$ and $x = 1$, i.e., the
value of $\int_0^1 f(x)\,dx$ for the given function $f$. Each function was
chosen such that for $x \in [0,1]$, we have $y \in [0,1]$ as well, so that the
graph fits into a $1 \times 1$ square, as shown in the figure. These happen
to be functions for which it is not possible to find an antiderivative,
hence the need for an approximation. Divide the interval up into 5
equal subintervals, sample the function at the center of each interval,
and find the resulting Riemann sum. Maintain four decimal places
of precision throughout the calculation so that you are left with three
decimal places at the end that are not likely to be way off simply
because of rounding.


c3 $(\sin x)/x$ ✓
c4 $e^{x-1}\tan(\pi x/4)$ ✓
c5 $[\cos(e^x)]^2$ ✓
c6 $x^x$ ✓
e1 Find three different functions of $x$ whose derivatives with
respect to $x$ are all $e^x$. > Solution, p. 241
e2 One or more of the following antiderivatives is incorrect. 


As described in section 8.4.3, use differentiation to find which are 
incorrect. Fix any incorrect ones. 


1 
feowapere [er ara +e 


[otra aa +e [otar=o+e 


fe dz =e*+ec 





> Solution, p. 241 

Evaluate the antiderivatives in problems e3-e14. If in doubt, guess
and check as in problem e2. With experience it gets easier to guess
correctly.

e3 [ez +1) dz v 
e4 i (1 — 3t) dt v 
e5 [eras 11) du v 
| ae ee Ge 

v 
e6 [ (ite 9 +3] dx 
e7 oa [q > 0] v 

q 
e8 {= da v 
e9 / ea Balen o 
2 

e10 [svg dy Vv 
ell [cosy dy v 
e12 [cs 2r dr v 
e13 [sue — 1/3) dr v 
el4 [ine +sin2x) dx Vv 


Evaluate the antiderivatives in problems g1-g3. All letters other than 
the variable of integration are constants. 


gl i (Ax + B) da Vv 
e2 / bode “lee J 
23 i: coswT dt Vv 
gd / eft dt Jv 

Problems k1-k4.


In problems i1-i4, find the antiderivatives. All letters other than the 
variable of integration are constants. These problems can be done 
by first rewriting the given integrand in a form that you know how 
to integrate. 


il | VBeveds > Solution, p. 242 








i2 jar dz Vv 
2 
3 Bera J 
VP 
i4 [axe v 
we 


These instructions are for problems k1-k4. Each function $f$ was
chosen such that for $x \in [0,1]$, we have $y \in [0,1]$ as well, so that the
graph fits into a $1 \times 1$ square, as shown in the figure.

(a) Make an eyeball estimate of the area under the curve.

(b) As in problems c3-c6, divide the interval up into 5 equal subintervals,
sample the function at the center of each interval, and find
the resulting Riemann sum. Maintain four decimal places of precision
throughout the calculation so that you are left with three decimal
places at the end that are not likely to be way off simply because of
rounding. Your result should be roughly consistent with your estimate
from part a, and you can also check it online.

(c) Find the antiderivative $\int f(x)\,dx$, and check it online.

(d) Evaluate the definite integral, $\int_0^1 f(x)\,dx$, check it against the
approximations in parts a and b, and check it online.




k1 $f(x) = \cos x$ ✓
k2 $f(x) = \sin x$ ✓
k3 f(a) = 5c J 
k4 $f(x) = \sqrt{x}$ ✓

Problems n1-n4 all involve calculating the work done by a force, as
described in example 9, p. 187. These problems also require you to
check the units of your result. To do that, you will need to know the
following. The SI unit of force is the newton (N). Work has units
of (force) × (distance), or N·m (newton-meters).


n1 The figure shows an archer drawing a longbow. When the
string is pulled back to a distance $x$ relative to its straight equilibrium
position, the force required from the right hand is given
approximately by $F = kx$, where $k$ is a constant. (a) Infer the units
of $k$. (b) Find the amount of work done in pulling the bow from
$x = 0$ to $x = b$, where $b$ is some number. (c) Check that the units
of your result make sense. > Solution, p. 242


n2 The figure shows the tension (force) of which a muscle is
capable. The variable $x$ is defined as the contraction of the muscle
from its maximum length $L$, so that at $x = 0$ the muscle has length
$L$, and at $x = L$ the muscle would theoretically have zero length. In
reality, the muscle can only contract to $x = cL$, where $c$ is less than
1. When the muscle is extended to its maximum length, at $x = 0$,
it is capable of the greatest tension, $T_0$. As the muscle contracts,
however, it becomes weaker. There is a nearly linear decrease, which
would theoretically extrapolate to zero at $x = L$. (a) Infer the units
of $c$ and $T_0$. (b) Find the maximum work the muscle can do in one
contraction, in terms of $c$, $L$, and $T_0$. (c) Show that your answer to
part b has the right units. (d) Show that your answer to part b has
the right behavior when $c = 0$ and when $c = 1$. ✓


n3 In July 1994, Comet Shoemaker-Levy 9, which had previously
broken up into pieces, collided with the planet Jupiter. The figure 
shows discolorations left in the jovian atmosphere where the impacts 
had occurred. The diameter of each bruise is on the same order of 
magnitude as the size of the planet earth. These were hard hits. 
The energy came from the work done by the sun’s gravity on the 
comet as it fell inward from the Oort Cloud, a hypothesized outer 
region of the solar system. Let x be the comet’s position relative 
to the sun, and assume that the comet falls in from the negative x 
direction, i.e., from the side of the sun that we would visualize as 
the left-hand side of the number line. The force of the sun’s gravity 
on the comet is given by Newton’s law of gravity, $F = GMm/x^2$,
where M is the mass of the sun, m is the mass of the comet, G is 
a universal constant, and the plus sign indicates that the force is to 
the right, i.e., toward the sun. 

(a) Infer the units of G. (b) Find the work done on the comet as 
it falls from x = —a to x = —b, where a is the distance from the 
sun to the Oort cloud, b is the distance from the sun to Jupiter, and 
both $a$ and $b$ are positive. (c) Check that the units of your answer
to part b make sense. Vv 








Problem n1. 








Problem n2. 





Problem n3. 

n4 See the instructions on p. 201. In a gasoline-burning car
engine, the exploding air-gas mixture makes a force on the piston, 
and the force tapers off as the piston expands, allowing the gas to 
expand. A not-so-bad approximation is that the force is given by 
F = k/x, where x is the position of the piston. (a) Infer the units 
of k. (b) Find the work done on the piston as it travels from x = a 
to x = b. (c) Show that the result of part b can be reexpressed so 
that it depends only on the ratio b/a. This ratio is known as the 
compression ratio of the engine. (d) Check that the units of your 
result in part c make sense. Vv 


q1 If a car on cruise control has the wrong speed at $t = 0$, it
will take some time for the system to correct the error. The system
may be designed to produce a velocity as a function of time given
by

$$v = u + be^{-rt},$$

where $u$ is the desired speed, $r$ is a constant chosen by the designer,
and $b$ is the initial error in velocity, which may be positive or negative.
The value of $r$ is a design compromise; if $r$ is too small, then it
will take a long time for the car to get back to the right speed, but if 
it is too big, the motion will be jerky or produce bad fuel efficiency. 
(a) Infer the units of u, b, and r. 

(b) Find the position x as a function of time. Vv 
(c) Give a physical interpretation of the constant of integration oc- 
curring in your answer to part b. 

(d) Check that your answer to part b has units that make sense. 
(e) Check your answer by differentiating it. 


q2 A piston in a car’s engine is connected to the crankshaft
through a piston rod. As the crankshaft spins at a constant rate, the
velocity of the piston in and out of the cylinder may be approximated
by a function

$$v = A\cos\omega t + B\cos 2\omega t,$$

where $\omega$ (Greek letter “omega,” which makes the “o” sound) is the
number of radians per second at which the crankshaft is rotating,
and $A$ and $B$ are constants that depend on the length of the piston
rod and the radius of the circle traveled by the piston pin. Note that
expressions of the form $\sin xy$ are normally to be read as $\sin(xy)$; if
the intended meaning had been $(\sin x)y$, then one would normally
have written it as $y \sin x$.

(a) Infer the units of $A$ and $B$. (The units of $\omega$ are simply inverse
seconds, $\text{s}^{-1}$.)

(b) Find the piston’s position x as a function of time. Vv 
(c) Give a physical interpretation of the constant of integration oc- 
curring in your answer to part b. 

(d) Check that your answer to part b has units that make sense. 
(e) Check your answer by differentiating it. 

In problems s1-s12, compute the definite integrals. These are in
groups of three similar problems, with the intention being that a
given student would do one from each group.


2 
s1 i. u? du v 
1 
2 
s2 / w? dw v 
1 
2 
s3 | s'/? ds v 
1 
1 
s4 [ en -snsy) dh J 
0 
1 
sd / (27 + 7z) dz v 
0 
1 
s6 / (2r4 — 2r? 4 r) dr Vv 
0 
4 
s7 ip (e79 + sing — ./g) dg v 
0 
ae a 
s8 i ( — a~9/? + cos a) da v 
1 a 
4 
s9 | (cosp +e? + p®) dp Vv 
0 
1 
s10 / u(/u+ Vu) du v 
0 
1 
sll / (1) Ge4D) ae z 
4 
2 1 2 
512 i] (3 f ) dj v/ 
1 J 
ul Is the following calculation wrong? Explain why or why not. 


1 1 
1 1 

[vars je? +0] =- 
" 2 a 2 


u2 Let the functions $f$ and $g$ be defined as follows.

$$f(x) = \begin{cases} \ln(-x) + 7 & \text{if } x < 0 \\ \ln x + 11 & \text{if } x > 0 \end{cases}$$

$$g(x) = \ln|x|$$

Is $f$ an antiderivative of $1/x$? Is $g$? Explain why or why not.

Chapter 9


Basic techniques of integration


9.1 Doing integrals symbolically on a computer


The quaint little town of Carmel, California, has a touristy business 
district that specializes in quaint little shops. I once went into a yarn 
store there with my mother, who picked out two skeins of yarn for 
a sweater. The business ran on paper and pen, which was arguably 
sensible, since there was little room on the cramped counter for a 
cash register. The following math problem resulted: 


$5.60 
x 2 


The proprietor pulled out a calculator and typed 0 x 2 =. The 
answer was 0, which she wrote down. Then 6 x 2 =, and so on. 


The point of this anecdote is that there are right ways and wrong 
ways to use tools. Computers are a good tool for doing integrals, 
but we should be able to do simple integrals by hand. 


The computer programs used for doing integrals are called com- 
puter algebra systems (CAS). I recommend a free and open-source 
CAS called Maxima.¹ The following example shows how to use
Maxima to do an easy indefinite integral, analogous to using the
calculator to find 6 × 2. The typewriter font shows what I typed
in, and the italicized text is the answer printed out by the program. 
Note the mandatory semicolon at the end of the input line. 


Integrating on a computer Example 1 


integrate(cos(x),x);
sin(x)





¹To use it through a web browser, go to maxima-online.org. To download it
to your computer, go to maxima.sourceforge.net. 

a/Integrating $(x - 1)^{1000000}$ by using a change of variable. The function is not drawn realistically; the rounded edge has been exaggerated in order to make the shaded area under the curve visible.

b/The change of variables just renames the points on the horizontal axis.


9.2 Substitution 


Here’s an example of an integral that introduces a useful technique 
of integration, and that also demonstrates what can go wrong if you 
become completely dependent on computers to do integrals that you 
should be able to do by hand. 


$$\int_1^2 (x - 1)^{1000000}\,dx$$


I tested this on three CAS programs, and although two were able to 
do it, one froze up indefinitely. My point is not that a certain CAS 
is better than some other one.² The point is that computers, unlike
humans, can’t step back and say, “Hey, what I’m doing isn’t working
so well. Maybe I should try something else.” The one that failed
presumably started grinding away to multiply out the polynomial,
all million and one terms of it: $x^{1000000} - 1000000x^{999999} + \ldots$
This is certainly a strategy that would work, in theory, because it 
would reduce the problem to one that we already know how to solve: 
integrating a polynomial. 


But there’s a better way to approach this, as suggested in figure 
a. Geometrically, what we’re trying to calculate is the very small 
area that is only visible at the corner of the figure. (Although the 
limits of integration run from 1 to 2, the value of the integrand is 
too small to matter except when x gets very close to 2.) Let’s shift 
the graph to the left by one unit, as shown in the figure, and define 
a new variable u = x —1. The shift to the left doesn’t change the 
amount of area under the curve; it simply relocates that area to a 
new place. In terms of this variable, the integrand is $u^{1000000}$, which
is a function that we know how to integrate. Expressed as an integral
with respect to $u$, the limits of integration are from $u = 1 - 1 = 0$
to $u = 2 - 1 = 1$. Do we need to do anything to the $dx$ other
than change it to a $du$? Not in this case; implicit differentiation of
$u = x - 1$ gives $du = dx$. The result is that we can calculate the
same area using the following easier integral.

$$\int_0^1 u^{1000000}\,du$$



This is easily found to equal 1/1000001. 
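
To see that the brute-force strategy and the substitution really agree, we can try both on smaller exponents, where multiplying out the polynomial is still feasible. This is a sketch in Python with exact fractions; the function names are hypothetical, and the exponent n stands in for the 1000000 of the example.

```python
from fractions import Fraction
from math import comb


def by_multiplying_out(n):
    """Integrate (x - 1)^n from 1 to 2 by expanding with the binomial theorem.

    (x - 1)^n is the sum over k of C(n, k) * (-1)^(n - k) * x^k, and each
    x^k integrates to (2^(k+1) - 1) / (k + 1) between the limits 1 and 2.
    """
    total = Fraction(0)
    for k in range(n + 1):
        coefficient = comb(n, k) * (-1) ** (n - k)
        total += coefficient * Fraction(2 ** (k + 1) - 1, k + 1)
    return total


def by_substitution(n):
    """With u = x - 1 the integral becomes that of u^n from 0 to 1."""
    return Fraction(1, n + 1)
```

The first function does work that grows with n and juggles enormous alternating terms, while the second is a one-liner. That contrast is the whole point of the change of variable.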


Figure b shows a nice way of thinking about this. Rather than
imagining that the graph itself has shifted horizontally, we can say
that the graph stayed in the same place, but we slid the axis over.
This is just like renaming the points on the horizontal axis. The
renaming is like sliding a ruler over without shrinking or expanding
the ruler. If we think of $dx$ as a small change in $x$, and similarly
for $du$, then it makes sense that $du = dx$; the distance or difference
between two points on a ruler is the same regardless of whether we
slide the ruler around.


²For the record, the two that could handle it were Maxima and integrals.com.
The one that failed was another open-source program called Yacas.


The procedure demonstrated above is called a change of variable, 
substitution, or sometimes “u-substitution,” since it seems to be 
common for calculus textbooks to use the letter $u$ in this context.
In general, $u$ can be defined as any function of $x$ that you think will
help to massage the integral into a more workable form. Substitution 
can be used on both definite and indefinite integrals. 


Substitution with rescaling Example 2
A common rate of return on ultra-safe, ten-year bonds has historically
been about 5%, which means that money invested in
these bonds grows by a factor of $e$ in about $1/\ln 1.05 \approx 20$ years.
Therefore we expect such an investment to grow exponentially
over time in proportion to the function $e^{t/20}$, where $t$ is in years.
Bonds often pay dividends, and although the dividend payments
actually occur at discrete time intervals, it can be convenient to
model them mathematically as if they were paid continuously, so
that the total dividend payment is

$$D = k \int_0^{10} e^{t/20}\,dt,$$

where $k$ is a constant. Let’s evaluate this integral.


Since the derivative of $e^x$ is $e^x$, we know how to integrate $e^x$, and
it’s natural to look for a substitution that makes the integrand into
this form. The substitution clearly has to be

$$u = \frac{t}{20}. \qquad (1)$$

If we think of the time axis as a “time-line” like the ones in history
books, then this substitution is like expanding the time-line’s
scale by a factor of 20. Solving for $t = 20u$ and applying implicit
differentiation gives

$$dt = 20\,du. \qquad (2)$$

The limits of integration change when expressed in terms of $u$.

$$t = 0 \Rightarrow u = 0 \qquad (3)$$
$$t = 10 \Rightarrow u = \tfrac{1}{2} \qquad (4)$$

We have to make use of all four of the equations (1)-(4) in order
to rewrite the integral in terms of the new variable $u$:

$$D = k \int_0^{1/2} e^u\,(20\,du) = 20k\, e^u \Big|_0^{1/2} = 20k\left(e^{1/2} - 1\right)$$
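
A quick numerical check of example 2 is to compare the closed-form answer, $20k(e^{1/2} - 1)$, with a Riemann sum of the original integral in terms of t. The sketch below is in Python; the function names and the choice of 10000 subintervals are mine.

```python
import math


def dividend_closed_form(k):
    """Result of the substitution u = t/20: 20 k (e^(1/2) - 1)."""
    return 20.0 * k * (math.exp(0.5) - 1.0)


def dividend_riemann_sum(k, n=10000):
    """Midpoint Riemann sum of k * e^(t/20) for t from 0 to 10."""
    dt = 10.0 / n
    return k * sum(math.exp((i + 0.5) * dt / 20.0) for i in range(n)) * dt
```

If the factor of 20 from dt = 20 du or the new limits had been forgotten, the two functions would disagree badly, so this is a cheap way to catch the most common substitution mistakes.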

A nonlinear substitution Example 3
> Evaluate

$$\int 2x \sin(x^2 + 3)\,dx.$$

> Here the only substitution that has any hope of working is $u = x^2 + 3$.
Implicit differentiation gives $du = 2x\,dx$, which happens to
be exactly the combination of factors that occurs in the integrand.
The integral therefore equals:

$$\int \sin u\,du = -\cos u + c = -\cos(x^2 + 3) + c$$

To check that this indefinite integral is correct, we can differentiate
it, which involves using the chain rule:

$$\frac{d}{dx}\left(-\cos(x^2 + 3) + c\right) = \sin(x^2 + 3) \cdot 2x$$

The method used to check example 3 shows that we should be
able to interpret what’s going on in these substitutions in terms of
the chain rule. The chain rule says that

$$\frac{dF(G(x))}{dx} = F'(G(x)) \cdot G'(x),$$

so that

$$\int F'(G(x)) \cdot G'(x)\,dx = F(G(x)) + c.$$

In example 3, we had $2x = \frac{d}{dx}(x^2 + 3)$. So let’s call $G(x) = x^2 + 3$,
and $F(u) = -\cos u$. Then

$$F(G(x)) = -\cos(x^2 + 3)$$

and

$$\frac{dF(G(x))}{dx} = \underbrace{\sin(x^2 + 3)}_{F'(G(x))} \cdot \underbrace{2x}_{G'(x)},$$

so that

$$\int 2x \sin(x^2 + 3)\,dx = -\cos(x^2 + 3) + c.$$
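
The same guess-and-check idea works numerically. In this Python sketch, I take the integrand of example 3 to be $2x\sin(x^2 + 3)$ and the proposed antiderivative to be $-\cos(x^2 + 3)$, and compare the integrand with a finite-difference slope of the antiderivative. The helper `slope` and its step size are my own choices.

```python
import math


def antiderivative(x):
    """F(G(x)) with G(x) = x^2 + 3 and F(u) = -cos u; the constant c is dropped."""
    return -math.cos(x * x + 3.0)


def integrand(x):
    """F'(G(x)) * G'(x) = sin(x^2 + 3) * 2x."""
    return math.sin(x * x + 3.0) * 2.0 * x


def slope(func, x, h=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)
```

Agreement at a handful of points doesn't prove that the antiderivative is right, but a disagreement at even one point proves that it's wrong.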

9.3 Integrals that can’t be done in closed form


Integral calculus was invented in the age of powdered wigs and harp- 
sichords, so the original emphasis was on expressing integrals in a 
form that would allow numbers to be plugged in for easy numerical 
evaluation by scribbling on scraps of parchment with a quill pen. 
This was an era when you might have to travel to a large city to get 
access to a table of logarithms. 


In this computationally impoverished environment, one always 
wanted to get answers in what’s known as closed form and in terms 
of elementary functions. 


A closed form expression means one written using a finite num- 
ber of operations, as opposed to something like the geometric series 
$1 + x + x^2 + x^3 + \ldots$, which goes on forever.


Elementary functions are usually taken to be addition, subtrac- 
tion, multiplication, division, logs, and exponentials, as well as other 
functions derivable from these. For example, a cube root is allowed, 
since $\sqrt[3]{x} = e^{(1/3)\ln x}$, and so are trig functions and their inverses,
because they can be expressed in terms of logs and exponentials by 
using Euler’s formula. 


In theory, “closed form” doesn’t mean anything unless we state 
the elementary functions that are allowed. In practice, when people 
refer to closed form, they usually have in mind the particular set of 
elementary functions described above. 


A traditional freshman calculus course spends such a large amount 
of time teaching you how to do integrals in closed form that it may 
be easy to miss the fact that this is impossible for the vast majority 
of integrands that you might randomly write down. Here are some 
examples of impossible integrals: 





$$\int e^{x}\tan x\,dx$$


The first of these is a form that is extremely important in statistics 
(it describes the area under the standard “bell curve”), so you can 
see that impossible integrals aren’t just obscure things that don’t 
pop up in real life. 


People who are proficient at doing integrals in closed form gener- 
ally seem to work by a process of pattern matching. They recognize 
certain integrals as being of a form that can’t be done, so they know
not to try.


Disobedience Example 4
> Students! Stand at attention! You will now evaluate $\int e^{-x^2+7x}\,dx$
in closed form.


> No sir, I can’t do that. By a change of variables of the form
$u = x + c$, where $c$ is a constant, we could clearly put this into the
form $\int e^{-x^2}\,dx$, which we know is impossible.


Sometimes an integral such as $\int e^{-x^2}\,dx$ is important enough
that we want to give it a name, tabulate it, and write computer subroutines
that can evaluate it numerically. For example, statisticians
define the “error function” $\operatorname{erf}(x) = (2/\sqrt{\pi}) \int e^{-x^2}\,dx$. Sometimes
if you’re not sure whether an integral can be done in closed form,
you can put it into computer software, which will tell you that it
reduces to one of these functions. You then know that it can’t be
done in closed form. For example, if you ask integrals.com to do
$\int e^{-x^2+7x}\,dx$, it spits back $(1/2)e^{49/4}\sqrt{\pi}\,\operatorname{erf}(x - 7/2)$. This tells you
both that you shouldn’t be wasting your time trying to do the inte- 
gral in closed form and that if you need to evaluate it numerically, 
you can do that using the erf function. 
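
Here is a sketch of that numerical use, in Python. I take the software's answer to be $(1/2)e^{49/4}\sqrt{\pi}\,\mathrm{erf}(x - 7/2)$ and check it the usual way, by differentiating it (with a finite difference) and comparing with the integrand $e^{-x^2+7x}$. The function names are hypothetical; `math.erf` is Python's standard-library error function.

```python
import math


def antiderivative(x):
    """The software's answer: (1/2) e^(49/4) sqrt(pi) erf(x - 7/2)."""
    return 0.5 * math.exp(49.0 / 4.0) * math.sqrt(math.pi) * math.erf(x - 3.5)


def integrand(x):
    return math.exp(-x * x + 7.0 * x)


def slope(func, x, h=1e-5):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


def definite_integral(a, b):
    """Integral of exp(-x^2 + 7x) from a to b, evaluated using erf."""
    return antiderivative(b) - antiderivative(a)
```

The numbers involved are large, since the integrand peaks at $e^{49/4}$, so it makes more sense to compare the slope and the integrand by their ratio than by their difference.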


As shown in the following example, just because an indefinite 
integral can’t be done, that doesn’t mean that we can never do a 
related definite integral. 


Example 5
> Evaluate $\int_0^{\pi/2} e^{-\tan^2 x}(\tan^2 x + 1)\,dx$.


> The obvious substitution to try is $u = \tan x$, for which
$du = (\tan^2 x + 1)\,dx$, and this reduces the
integrand to $e^{-u^2}$. This proves that the corresponding indefinite
integral is impossible to express in closed form. However, the
definite integral can be expressed in closed form; it turns out to
be $\sqrt{\pi}/2$.
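
The claimed value can be checked with a Riemann sum. In this Python sketch I assume the limits of integration run from 0 to π/2, which is what the substitution u = tan x needs in order to cover u from 0 to infinity. Sampling at the centers of the subintervals conveniently avoids the endpoint π/2, where the tangent blows up.

```python
import math


def integrand(x):
    t = math.tan(x)
    return math.exp(-t * t) * (t * t + 1.0)


def midpoint_sum(func, a, b, n):
    """Riemann sum with n equal subintervals, sampled at their centers."""
    dx = (b - a) / n
    return sum(func(a + (i + 0.5) * dx) for i in range(n)) * dx


approx = midpoint_sum(integrand, 0.0, math.pi / 2.0, 2000)
exact = math.sqrt(math.pi) / 2.0
```

Near the upper limit the exponential factor is so small that the computer rounds it to zero, which is harmless here because the true contribution from that region is negligible.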

Sometimes computer software can’t say anything about a par- 
ticular integral at all. That doesn’t mean that the integral can’t 
be done. Computers are stupid, and they may try brute-force tech- 
niques that fail because the computer runs out of memory or CPU 
time. For example, the integral $\int dx/(x^{10000} - 1)$ can be done in
closed form, and it’s not too hard for a proficient human to figure 
out how to attack it, but every computer program I’ve tried it on 
has failed silently. 

9.4 Doing an integral using symmetry or geometry


Often we can figure out the value of an integral either by symmetry 
or by using simple geometry. 

An integral that vanishes by symmetry Example 6
> Evaluate

$$\int_{-1}^{1} \frac{\sin x}{1 + x^2}\,dx$$

> I doubt that this can be done by finding the indefinite integral
and plugging in the limits of integration. I tried it using the open-source
program Maxima, and also using the web interface to a
proprietary program called Mathematica, and neither could do it. 
However, the function is odd because the numerator is odd and 
the denominator is even. Since the function is odd, and the limits 
of integration are symmetrically placed on either side of the origin, 
the definite integral is guaranteed to be zero; any negative contri- 
bution to the integral on the left is guaranteed to be canceled by 
a matching positive contribution on the right. 


An integral that can be done by geometry   Example 7
> Evaluate

$$\int_0^{2\pi} \sin^2\theta\,d\theta.$$


> The hard way to do this integral is to dig up the appropriate trig
identity, which allows $\sin^2\theta$ to be reexpressed in terms of $\cos 2\theta$.
The easy way is to look at the graph, figure d. The rectangle is
exactly half filled by the area under the graph. Since the rectangle
has area $2\pi$, the integral equals $\pi$.
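
For comparison, here is a sketch of the "hard way" for this example, using the standard double-angle identity. It is offered only as a cross-check on the geometric argument.

```latex
\begin{aligned}
\int_0^{2\pi} \sin^2\theta \, d\theta
  &= \int_0^{2\pi} \frac{1 - \cos 2\theta}{2} \, d\theta \\
  &= \left[ \frac{\theta}{2} - \frac{\sin 2\theta}{4} \right]_0^{2\pi} \\
  &= \pi
\end{aligned}
```

The cosine term integrates to zero over two full periods, leaving half the area of the rectangle, in agreement with the graphical reasoning.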


c / The integrand of example 6.


d / The integrand of example 7.
9.5 Some forms involving exponentials,
rational functions, and roots 


Here are some forms whose antiderivatives may not be obvious at 
first sight. 


9.5.1 Exponentials with the base not e 


Since the derivative of $e^x$ with respect to x is just $e^x$ again, we
already know how to integrate $e^x$. What about exponentials with
other bases? These can be converted into base e using the identity
$a^x = e^{x\ln a}$, then integrated using a change of variable.


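Example 8
> Evaluate $\int 3^x\,dx$.


>
$$\begin{aligned}
\int 3^x\,dx &= \int e^{x\ln 3}\,dx && [\text{using } a^x = e^{x\ln a}] \\
&= \frac{1}{\ln 3}\int e^u\,du && [\text{substituting } u = x\ln 3] \\
&= \frac{1}{\ln 3}\,e^u + c \\
&= \frac{3^x}{\ln 3} + c && [a^x = e^{x\ln a} \text{ again}]
\end{aligned}$$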
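
One way to check an antiderivative like this is to differentiate it numerically and compare with the original integrand. The sketch below does that with a symmetric finite difference; the function names and the step size `h` are illustrative assumptions.

```python
import math

def antiderivative(x):
    """Candidate antiderivative of 3**x from this example (constant c omitted)."""
    return 3.0 ** x / math.log(3.0)

def central_difference(f, x, h=1e-5):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (-1.0, 0.0, 0.5, 2.0):
    slope = central_difference(antiderivative, x)
    # The slope of the antiderivative should reproduce the integrand 3**x.
    print(x, slope, 3.0 ** x)
```

If the last two columns agree at every sample point, the result is very unlikely to contain an error such as a misplaced factor of ln 3.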


9.5.2 Some forms involving rational functions and roots 


In sections 5.10-5.11, pp. 137-137, we summarized the derivatives 
of various transcendental functions. Each of these potentially gives 
some way to integrate something, by applying the fundamental the- 
orem. Some of these derivatives are not themselves transcendental 
functions, which makes it not at all obvious when looking at them 
that they should be attacked in this way: 


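derivative                                   integral
$(\tan^{-1} x)' = (1+x^2)^{-1}$          $\int (1+x^2)^{-1}\,dx = \tan^{-1} x + c$
$(\tanh^{-1} x)' = (1-x^2)^{-1}$         $\int (1-x^2)^{-1}\,dx = \tanh^{-1} x + c$
$(\sin^{-1} x)' = (1-x^2)^{-1/2}$        $\int (1-x^2)^{-1/2}\,dx = \sin^{-1} x + c$
$(\sinh^{-1} x)' = (1+x^2)^{-1/2}$       $\int (1+x^2)^{-1/2}\,dx = \sinh^{-1} x + c$
$(\cosh^{-1} x)' = (x^2-1)^{-1/2}$       $\int (x^2-1)^{-1/2}\,dx = \cosh^{-1} x + c$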
Problems


In problems a1-a12, evaluate the indefinite integrals. Check your
answer by differentiating it, and also check it online. All letters 
other than the variable of integration are constants. These are in 
groups of three similar problems, with the intention being that a 
given student would do one from each group. 





df 
1 2 Vv 
a / af 4 [f > 2] 
dw 
v 
a2 aa [w <1] 
a3 i a lq < 0] v 
q 
a4 pe dx v 
ad ye ds [c > 0] v 
a6 / 10°*? dé v 
dt 
af age 
a8 [> Vv 
(g) +1 
ad [A> 0 J 


a10 $\int \cos^n\zeta\,\sin\zeta\,d\zeta$

[$n \neq -1$; ζ is lowercase Greek zeta, which makes the “z” sound.] √


all i ee dd 


(λ is lowercase Greek lambda, which makes the “l” sound.) √


al2 je cos p dp v 
In c1-c6, use a substitution to evaluate the indefinite integrals.





in2ax d 
cl [sm (=**) dx Vv c4 > Vv 
5 1+ sina 
sin 2x dx ey c5 ace da ms 
c2 SS 
V1+cos2x Lt 
c6 ae J 
a4 He sin 2x7 dx J t 


2 
Pree c7 [Gssve- Ide 


In e1-e9, use a substitution to evaluate the definite integrals. 














el [ u du J e6 ae sin? @cos 6 dé 
1 1 + ue J 
22 
e2 / a eu Vv v2 
ay oe e7 | E(1 + 2€7)!9 dé 
5 a dx oy 
e3 [ Vv 
0 z+l 
3 
a 2 a? dx J e8 | a Vv 
; apa a4 oe 0 Ltt 
e5 cos (6 + 7/3) dé 2 
Jr ( a) af e9 i at dx Vv 
1 


In problems g1-g2, two indefinite integrals are given that involve 
functions which look similar to one of the following: 


2 sin x i 
e tanz 





x 
As discussed in section 9.3, the four functions given above can’t be 
integrated in closed form. In each pair below, one can be integrated, 
while the other can be made into one of the above forms by a sub- 
stitution, proving that it’s impossible to integrate. Determine which 
is which, integrate the one that can be done, and check your answer 
to that one online. 


gl (a) ae a dx (b) [ee da v 


g2 (a) jae sin = dx (b) fotsine dx v 
Chapter 10
Applications of the integral 


10.1 Probability 
10.1.1 Introduction to probability 


Measurement of probabilities 


Defining randomness is a difficult problem, tied up with classical 
philosophical issues such as determinism and free will. Mathemati- 
cians sidestep this question by simply using numbers between 0 and 
1 to represent probabilities. A zero probability represents an event 
that can’t happen, a probability of 1 an event that is guaranteed to
happen. In between we have things that might or might not happen. 
A flipped coin comes up heads with probability 1/2. 


Statistical independence 


When ordinary people say that an event is “random,” they usu- 
ally mean not just that it has a probability greater than 0 and less 
than 1, but also that it can’t be predicted, because there is no way 
of finding a connection with another event that caused it. This lack 
of connection is considered by mathematicians to be separate from 
randomness itself, and is defined as follows. 


Definition of statistical independence 

Events A and B are said to be statistically independent if the 
probability that they will both happen is given by the product 
of the two probabilities. 


Events can be random but not independent. It might or might 
not rain tomorrow, and there might or might not be a forest fire. 
These events are both random, but they are not independent, since 
rain makes fire less likely. 


Normalization 


Suppose that we are able to exhaustively list all of the possible 
outcomes A, B, C,...of some situation, and that these outcomes are 
mutually exclusive. Then exactly one of these outcomes must occur, 
so the probabilities must add up to one. For example, suppose that 
we flip a coin, and A is the event that the coin comes up heads, 
B tails. Then $P_A + P_B = \tfrac{1}{2} + \tfrac{1}{2} = 1$. This property is called
normalization. 


a / The probability that one
wheel on the slot machine will
give a cherry is 1/10. If the three
probabilities are independent,
then the probability that all three
wheels will give cherries is
1/10 × 1/10 × 1/10.


b / The earth’s surface is 30%
land and 70% water. If we spin
a globe and pick a random
point, the probabilities of hitting
land and water are 0.3 and 0.7.
Normalization requires that these
two probabilities add up to 1.


c / The sum of the two dice
is a random variable with possible
values running from 2 to 12.


d / The histogram shows the
probabilities of the various outcomes
when rolling two dice.


e / A probability distribution
for height of human adults. (Not
real data.) Horizontal axis: height (cm);
vertical axis: probability per unit height (cm⁻¹).


10.1.2 Continuous random variables 


When numerical values are assigned to outcomes, the result is 
called a random variable. The sum of the rolls of two dice is a 
random variable, and we can assign probabilities to the different 
results. For example, the probability of rolling 2 is 1/36, since the 
probability of getting a 1 on the first die is 1/6, and similarly for 
the second die. All of the relevant information about probabilities 
can be summarized by the discrete function shown in figure d. 
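
The discrete distribution for the sum of two dice can be tabulated by brute force. The following sketch enumerates all 36 equally likely outcomes; the variable names are illustrative, and exact fractions are used so that normalization can be checked without rounding.

```python
from collections import Counter
from fractions import Fraction

# Count how many of the 36 equally likely (first die, second die) pairs give each sum.
counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))

# Convert counts to exact probabilities.
prob = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

for total, p in prob.items():
    print(total, p)

# Normalization: the probabilities of all possible sums add up to one.
print(sum(prob.values()))
```

The entry for a sum of 2 reproduces the 1/36 computed in the text as the product of two independent probabilities of 1/6.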


But when a random variable is continuous rather than discrete, 
we usually cannot make a useful graph of the probabilities, because 
the probability of any particular real number is typically zero. For 
example, there is zero probability that a person’s height h will be 
160 cm, since there are infinitely many possible results that are close 
to that value, such as 159.999999999999996876876587658465436 cm. 
What is useful to talk about is the probability that h will be less 
than a certain value. The probability of h < 160 cm is about 0.5. In 
general, we define the cumulative probability distribution P(x) of a 
random variable to be the probability that the variable is less than 
or equal to x. We can then define the probability distribution of the 
variable to be 

$$D(x) = P'(x). \qquad (1)$$


Figure e shows an approximate probability distribution for human 
height. Suppose we want to know the probability that our random 
variable lies within the range from a to b. This is P(b) — P(a). By 
the fundamental theorem of calculus, this can be calculated from 
the definite integral of the distribution, 


$$P(b) - P(a) = \int_a^b D(x)\,dx. \qquad (2)$$


That is, areas under the probability distribution correspond to prob- 
abilities. If the random variable has some units, say centimeters, 
then the units of the probability distribution D are the inverse of 
those units, e.g., cm⁻¹ in our example. In this example, D can be
interpreted as the probability per centimeter. A uniform distribution
is one for which D is a constant throughout the range of possible 
values of x. 


An extremely common bell-shaped probability distribution is 


called the “normal” or “Gaussian” distribution, which we encoun- 
tered in section 8.7.1, p. 195. 


If there are definite lower and upper limits L and U for the
possible values of the random variable, then normalization requires
that

$$1 = \int_L^U D(x)\,dx. \qquad (3)$$
The average $\bar{x}$ of a variable that takes on one of two discrete
values with equal probability is $(x_1 + x_2)/2$, which is the same as
$x_1P_1 + x_2P_2$. Generalizing this to a continuous random variable, we
have

$$\bar{x} = \int_L^U xD(x)\,dx. \qquad (4)$$
The average is also known as the mean, expectation, or the expected 
value of x. 


The standard deviation σ of a random variable x is a measure
of how much it varies around its average value. The symbol σ is the
lowercase Greek “sigma.” (Recall that uppercase sigma is Σ.) The
standard deviation of a continuous random variable is defined by

$$\sigma^2 = \int_L^U (x - \bar{x})^2 D(x)\,dx. \qquad (5)$$
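
To make the definitions of normalization, mean, and standard deviation concrete, here is a small numerical sketch. The distribution D(x) = 2x on the interval from 0 to 1 is an assumed example chosen for illustration (it does not come from the text), and `midpoint_integral` is a hypothetical helper.

```python
import math

def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f from a to b using n midpoint slices."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def D(x):
    # Assumed example distribution: D(x) = 2x for 0 <= x <= 1.
    return 2.0 * x

lower, upper = 0.0, 1.0

norm = midpoint_integral(D, lower, upper)                      # should be 1
mean = midpoint_integral(lambda x: x * D(x), lower, upper)     # average of x
variance = midpoint_integral(lambda x: (x - mean) ** 2 * D(x), lower, upper)
sigma = math.sqrt(variance)                                    # standard deviation

print(norm, mean, sigma)
```

For this assumed distribution the integrals can also be done by hand, giving a mean of 2/3 and a standard deviation of 1/√18, so the printed values can be compared against those.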


10.1.3 One variable related to another 





It often happens that one random variable y is defined by some
function of some other random variable x. In an experiment, for
example, one may measure x directly, and the value of x is a random
variable because of the finite precision of the measurement. If one
calculates the result of the experiment using some function y(x),
then the result is also a random variable. Let the corresponding
probability distributions be $D_x$ and $D_y$, and let P be the
cumulative probability for a given x or y. Then $D_y$ can be
determined from $D_x$ by the chain rule:

$$\begin{aligned}
D_y &= \frac{dP}{dy} && [\text{definition of } D] \\
&= \frac{dP}{dx}\,\frac{dx}{dy} && [\text{chain rule}] \\
&= D_x\,\frac{dx}{dy} && [\text{definition of } D] \\
&= \frac{D_x}{y'(x)} && [\text{derivative of the inverse of a function}]
\end{aligned}$$
A random goblin Example 1 


Often in computer simulations or games one wants to produce 
a random number with some desired distribution. For example, 
in a fantasy adventure game, we might wish to generate an op- 
ponent such as a goblin whose strength statistic y is distributed 
according to some bell-shaped curve Dy with a given mean and 
standard deviation. The random number generators supplied in 
computer programming libraries usually output a number x with 
a uniform distribution from 0 to 1, so that $D_x = 1$. We then have
$y'(x) = 1/D_y$. Integrating both sides of this equation allows us to
find a function y(x) that determines the strength of the goblin.
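
A minimal sketch of this recipe follows. A bell-shaped curve would require numerical integration, so as a stand-in this example assumes the simpler distribution $D_y = e^{-y}$ for y ≥ 0, for which the equation $y'(x) = 1/D_y$ with y(0) = 0 integrates by hand to y = −ln(1 − x). The function names and the choice of distribution are illustrative assumptions.

```python
import math
import random

def D_y(y):
    # Assumed target distribution (a stand-in for the bell curve): exp(-y), y >= 0.
    return math.exp(-y)

def y_of_x(x):
    # Solution of y'(x) = 1/D_y(y) with y(0) = 0, valid for 0 <= x < 1.
    return -math.log(1.0 - x)

rng = random.Random(0)  # seeded generator giving x uniform on [0, 1)
samples = [y_of_x(rng.random()) for _ in range(100_000)]

# The mean of exp(-y) on y >= 0 is 1, so the sample mean should be close to 1.
print(sum(samples) / len(samples))
```

The same pattern applies to any target distribution: only `D_y` and the integrated function `y_of_x` change.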
>Box 10.1 Applications to economics

The following is an index of applications of calculus to
economics that occur throughout this book.

p.    application
18    marginal rate of substitution   derivative
59    economic order quantity         extrema
106   the Laffer curve                Rolle’s theorem
115   supply and demand               intermediate value theorem


10.2 Economics 


In 1882, at the age of 46, William Stanley Jevons went swim- 
ming in the ocean and drowned. As a pioneer of classical economics, 
Jevons developed mathematical models that treated humans as ra- 
tional actors seeking to maximize their happiness. His choice to go 
swimming that day was presumably based on the fact that swim- 
ming would cause him to be happy, and on the conscious or uncon- 
scious expectation that his risk of death would be low. But how 
do we define “rational” and “happiness” mathematically? Believe 
it or not, economists did produce definitions of these ideas, but in 
the process the word “happiness” changed to “utility,” and the con- 
cepts morphed into forms that were very different from their original 
meanings. They are central to modern economics. 


A 1947 paper by John von Neumann and Oskar Morgenstern 
(VNM) introduces four axioms defining rationality, which I’ll de- 
scribe here in English rather than equations: 


1. Preferences are consistent. 


2. Preferences are transitive: if you like outcome A more than B, 
and B more than C, then you like A more than C. 


3. No outcome is infinitely good or bad. For example, if Jevons 
had believed that death was infinitely bad, he might have been 
unwilling to accept any risk of drowning. (Cf. example 11, 
p. 113.) 


4. A preference for A over B holds regardless of whether some 
other outcome exists. For example, if you like Bach more than 
bebop, this is true regardless of whether it rains. 


VNM prove that if these axioms hold, it is possible to assign a real 
number u(x), called the utility function, to any outcome x such that 
a rational actor always maximizes the expected value of u as defined 
by equation (4), p. 217. The utility function can be rescaled or have 
a constant added to it, but is otherwise unique. 


Although I’ve described this in terms of human preferences, the 
axioms may fail for humans or hold for non-humans. It only matters 
if the actor behaves as if it were acting rationally, as defined by the 
axioms. Milton Friedman writes: 


I suggest the hypothesis that the leaves [on a tree] are 
positioned as if each leaf deliberately sought to maximize 
the amount of sunlight it receives, given the position of 
its neighbors, as if it knew the physical laws determining 
the amount of sunlight that would be received in vari- 
ous positions and could move rapidly or instantaneously 
from any one position to any other desired and unoccupied
position.


Daniel Kahneman, on the other hand, won the Nobel prize for his 
work showing that humans often violate the VNM definition of ratio- 
nality, but in ways that can be described scientifically. For instance, 
he showed in experiments that subjects were willing to pay one price 
for a trinket such as a mug, but that if they were given the mug, 
they demanded a different and systematically higher price to sell 
it. This violates axiom 1. Axiom 1 was implicitly assumed in the 
description of the indifference curve in example 2, p. 18. 


Playing the lottery Example 2 
Joe is broke and homeless. He currently has an amount of money
x = 0. Joe’s utility function is given by


$$u(x) = 1 - e^{-x},$$


where x is in some appropriate units such as thousands of dol- 
lars. The shape of this function is shown in figure f. It is concave 
down, which is a feature that is almost always realistic for a utility 
function that depends on how much money someone has. If Joe 
is broke and gains $10, he’s really happy, whereas if Bill Gates 
saw a $10 bill on the sidewalk, he probably wouldn’t bother to 
bend over and pick it up. 


Joe knows of a lottery in which each player receives a random 
amount of money uniformly distributed on the interval from 0 to 1. 
What price L should Joe be willing to pay for the lottery ticket, if 
he has the opportunity to borrow the price from his mother? 


If Joe enters and receives the minimum payout of 0, he will have
x = −L, i.e., he will be in debt to his mother for the price of the
ticket and have nothing to show for it. If he gets the maximum
reward of 1, he will have x = 1 − L. Since this interval has width 1,
and the result is uniformly distributed, normalization requires that
D(x) = 1 within the interval. We find his expected utility.


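$$\bar{u} = \int_{-L}^{1-L} u(x)D(x)\,dx = \int_{-L}^{1-L} \left(1 - e^{-x}\right)dx$$

$$= \left[x + e^{-x}\right]_{-L}^{1-L} = 1 - \left(1 - e^{-1}\right)e^{L}$$


Joe’s current utility is u(0) = 0, so it is rational for him to
pay any amount L that gives him $\bar{u} > 0$. The result is
$$L < -\ln\left(1 - e^{-1}\right) \approx 0.46.$$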


If Joe’s utility function had been u(x) = x, then he should have 
been willing to pay 0.5 units of money for a chance to win between 
0 and 1 units. But because his utility function is nonlinear, he is 
willing to pay less than that; he is risk-averse. 
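
The break-even ticket price in this example can also be found numerically, which is a useful cross-check and carries over to utility functions that cannot be integrated by hand. The sketch below assumes the utility function u(x) = 1 − exp(−x) and the uniform payout described in the example; the function names, slice count, and bisection tolerance are illustrative choices.

```python
import math

def utility(x):
    # Utility function assumed from the example: u(x) = 1 - exp(-x).
    return 1.0 - math.exp(-x)

def expected_utility(price, n=2000):
    """Midpoint-rule integral of u(x) * D(x), with D = 1 on [-price, 1 - price]."""
    h = 1.0 / n
    return h * sum(utility(-price + (i + 0.5) * h) for i in range(n))

def break_even_price(lo=0.0, hi=1.0, tol=1e-10):
    """Bisection for the price at which the expected utility crosses zero."""
    # Expected utility is positive at lo and negative at hi, and decreases with price.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_utility(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

numeric = break_even_price()
closed_form = -math.log(1.0 - math.exp(-1.0))
print(numeric, closed_form)
```

Both numbers should come out near 0.46, below the 0.5 that a linear utility function would give.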


>Box 10.2 Applications to physics

The following is an index of applications of calculus to
physics that occur throughout this book.

p.    application
22    velocity                        derivative
75    nuclear stability               extrema
83    acceleration                    2nd derivative
88    Newton’s 2nd law                2nd derivative
89    jerk and damage                 3rd derivative
158   lever                           related rates
161   pulley                          implicit differentiation
187   work                            definite integral
188   constant-acceleration motion    indefinite integral

10.3 Physics 


A conservation law is a physical law stating that the total amount 
of a certain quantity stays constant. (This usage of “conservation” 
doesn’t have the usual connotation of not using something up. In 
this context, the word implies that you couldn’t use it up if you tried, 
because the total amount can’t go down!) Some important examples
of conserved quantities are mass, energy,¹ momentum, electric
charge, and angular momentum (a measure of rotational motion). 
Conservation laws play a central role in physics. They are more fun- 
damental than Newton’s laws of motion. For example, a ray of light 
can be described by conservation of energy, but we get nonsense if 
we try to apply Newton’s laws to it (m = 0, so we can’t compute 
a= F/m). 


Calculus deals with rates of change and the accumulation of 
change, so it would seem to have no application to variables that 
are guaranteed never to change! But conserved quantities can be 
transferred or transformed at some rate. For example, we estimated 
in example 9, p. 53, that hiking burns about 200 calories per hour. 
The calorie is a unit of energy.² This number represents the rate at
which food energy is being transformed into other forms of energy 
such as body heat. For each conserved quantity, it’s of interest to 
define a name, symbol, and unit for its rate of transfer or trans- 
formation. We then have two variables, which are related to one 
another as integral and derivative with respect to time. In the fol- 
lowing table, the conserved quantity is given on top along with its 
symbol and SI unit. Its derivative is the variable below. 








mass    energy      momentum    angular momentum   electric charge
m       E           p           L                  q
kg      joule, J    N·s         N·m·s              coulomb, C
        power       force       torque             current
        P           F           τ                  I
kg/s    watt, W     newton, N   N·m                ampere, A





Since the SI unit of time is the second (s), we have the following im- 
plied relationships between some of the units: W=J/s and A=C/s. 


The definitions of the conserved quantities are ultimately op- 
erational definitions, meaning definitions that state the operations 
needed in order to measure them. This may seem unsatisfactory, 
but history has shown that every attempt at a “pure” conceptual 
or mathematical definition has had to be revised. We can however 





¹ According to Einstein’s famous E = mc², mass and energy are equivalent or
interconvertible, so they aren’t separately conserved. Their separate conservation
is however a good approximation in ordinary life, where relativistic effects
are negligible.

² Food calories are actually kilocalories, 1 kcal = 1000 cal. The SI unit is not
the calorie but the joule.
give rough conceptual definitions that are valid within the field of
mechanics, i.e., the study of material objects: 


Mass is a measure of inertia. How hard is it to change the motion 
of a certain object? 


Momentum is a measure of the motion of an object. Suppose 
our object hits another object, the “target.” Knowing the 
momentum allows us to predict how strongly a standard target 
will recoil. Momentum has a direction in space. 


Energy comes in various forms such as kinetic energy (energy of 
motion), heat (which is random motion at the atomic level), 
and electrical energy (such as the chemical energy in food). 
Energy has no direction. 


Box 10.3 gives some examples of equations for conserved quantities. 


Energy of an accelerating car Example 3 
> A car of mass m starts moving from rest with a constant ac- 
celeration a. If the speed is small enough, then air resistance is 
negligible, and the power required from the engine at time t is


$$P = kma^2t,$$


where the unitless fudge factor k accounts for inefficiency of the 
engine and frictional heating in the tires, and is assumed to be 
constant. Find the energy expended by burning gas as a function 
of time. 


> Because the power isn’t constant, we can’t simply multiply “the”
power by the time t. The integral is needed here as the correct
generalization of multiplication (section 8.4.1, p. 186).

$$\begin{aligned}
E &= \int P\,dt && [\text{integral-derivative relationship of } E \text{ and } P] \\
&= \int kma^2t\,dt \\
&= \tfrac{1}{2}kma^2t^2 && [\text{let initial energy consumption} = 0]
\end{aligned}$$


For motion with constant acceleration, $v = at + v_0$, where $v_0 = 0$
here because the car starts from rest. The result can therefore be
rewritten as $(1/2)kmv^2$. The factor $(1/2)mv^2$ is called the kinetic
energy of the car. If the car were perfectly efficient, we would have
k = 1, and all the energy expended would go into kinetic energy, 
rather than frictional heating. 
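
The relationship between power and energy in this example can be checked numerically by accumulating the power over many short time slices. In the sketch below, the values of k, m, a, and the final time are made-up illustrative numbers, and the function names are hypothetical.

```python
def power(t, k, m, a):
    # Engine power at time t for constant acceleration from rest: P = k m a^2 t.
    return k * m * a ** 2 * t

def energy_expended(t_end, k, m, a, n=10_000):
    """Midpoint-rule integral of the power from t = 0 to t = t_end."""
    h = t_end / n
    return h * sum(power((i + 0.5) * h, k, m, a) for i in range(n))

# Assumed illustrative values: fudge factor, mass (kg), acceleration (m/s^2), time (s).
k, m, a, t_end = 1.5, 1000.0, 2.0, 5.0

v = a * t_end                        # final speed, since the car starts from rest
numeric = energy_expended(t_end, k, m, a)
closed_form = 0.5 * k * m * v ** 2   # (1/2) k m v^2

print(numeric, closed_form)
```

Because the power grows linearly with time, simply multiplying the final power by the elapsed time would overestimate the energy by a factor of two, which is why the integral is needed.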




>Box 10.3 Examples of
equations for conserved
quantities


Let a material object of
mass m be moving at a velocity
v that is small compared to
the speed of light. Then experiments
show that its momentum
and kinetic energy are approximately
p = mv and E = (1/2)mv².


If a ray of light has energy
E, then its momentum is p =
E/c, where c is the speed of
light. This momentum is too
small to matter in everyday life.


If a material object moves
at a speed that is not small
compared to c, then it has
$p = mv/\sqrt{1 - v^2/c^2}$.


Let a ring with mass m and
radius r rotate about its own
axis so that each point on it
moves at speed v. Then its
angular momentum is ±mvr,
with the sign indicating the
direction of rotation.
Problems


a1 A computer language will typically have a built-in subroutine
that produces a fairly random number that is equally likely to take 
on any value in the range from 0 to 1. Find the standard deviation. 


v 


a2 A laser is placed one meter away from a wall, and spun on
the ground to give it a random direction, but if the angle θ shown
in the figure doesn’t come out in the range from 0 to π/2, the laser
is spun again until an angle in the desired range is obtained.

(a) Find the probability distribution $D_\theta$ of the variable θ.

(b) Using the technique described in section 10.1.3 on p. 217, find
the probability distribution $D_x$ of the distance x shown in the figure.
√

Problem a2.





a3 A computer language will typically have a built-in subroutine
that produces a fairly random number that is equally likely to take
on any value in the range from 0 to 1. If you take the absolute
value of the difference between two such numbers, the probability
distribution is of the form D(x) = k(1 − x).
(a) Find the value of the constant k that is required by normalization. √
(b) Find the average value of x. √
(c) Find the standard deviation. √





c1 Scientists in Daniel Lieberman’s Skeletal Biology Lab at Harvard
specialize in measuring the forces that act on a runner’s body,
which may help to improve coaching, reduce injuries, or provide
scientific evidence about whether barefoot running is healthier than
using running shoes. The graph in the figure shows a typical result
for the vertical force as a function of time that acts between the
runner’s foot and a treadmill, for one portion of a stride cycle.
The end of the graph, where the force goes to zero, is the time
at which the runner’s back toe leaves the ground and he becomes
airborne for a fraction of a second.

The initial time t = 0 is the one when the vertical force is at its
greatest, shown in the drawing. At this time, the runner’s body is
about as low as it will get, and the vertical momentum is approximately
zero.

The graph looks like a parabola, so let’s model it as one,
$F = b(1 - t^2/\tau^2) - w$, where τ is the time at which the graph ends, and
the −w term accounts for gravity. (a) Infer the units of the constants
b, τ, and w. (b) Find the runner’s vertical momentum at t = τ, i.e.,
the momentum with which he takes off into the air. (c) Check that
your answer to part b has units that make sense. √

Problem c1. (Graph: upward force on foot as a function of t.)


e1 In example 2, p. 219, we found the maximum amount that a
person should be willing to pay for a lottery ticket, given a certain 
utility function. We assumed the utility function to be concave 
down, which is usually realistic, for the reasons discussed in the 
example. But there can also be cases where the utility function is 
concave up. Suppose that Sally has cancer and no health insurance. 
She can only survive if she gets expensive treatment, which she 
can’t presently afford. A small amount of money does her very 
little good, except that it slightly reduces the amount she still needs 
to get together for the treatment. In this situation, it might make 
sense to posit a concave-up utility function, such as $u(x) = e^x - 1$,
in the notation of the previous example. Redo the example with
this utility function. √
Answers and solutions


Solutions to homework problems 
Solutions for chapter 1 


Page 36, problem a5: 

The graph fails the vertical line test: a vertical line can pass through 
more than one point on the graph, meaning that there can be more 
than one pressure for a given temperature. Therefore p is not a 
function of T. 


If we were to interchange the axes of the graph, it would pass 
the vertical line test. Therefore T can be described as a function of
p. For a given pressure, there is only one temperature. 


Page 36, problem a6: 

A line will not be a function when it fails the vertical line test, i.e.,
when the line itself is a vertical line. Such a line is a set of points
for which x is a constant. The equation (...)x + (...)y + (...) = 0
can only be reduced to x = constant if the coefficient of y is zero.


Page 36, problem a7: 

All of them pass the vertical line test except for $x = y^2$, which has
two y values for every positive x value. E.g., for x = 4, a vertical
line passes through both y = 2 and y = −2.


Page 36, problem a8: 

We have a set of points that are included in the set, which are 
those for which the given polynomial is negative. The set of points 
that are not included are those for which the polynomial is zero 
or positive. There is an edge or boundary between these two sets, 
consisting of any points at which the polynomial is zero, i.e., the 
roots of the polynomial. We could use the quadratic formula to find 
these roots. But since u = 0 is clearly a root, it’s simpler just to 
factor the polynomial into u(u — 2), which tells us that the other 
root is 2. Clearly the set S must be either the interval (0,2) or 
everything that lies outside this interval. Checking u = 1, we see 
that it’s the former possibility that holds. Thus a simpler description 
is S = {u | u > 0 and u < 2}.


Page 37, problem c1:

The derivative is a rate of change, so the derivatives of the constants 
1 and 7, which don’t change, are clearly zero. The derivative can be 
interpreted geometrically as the slope of the tangent line, and since 
the functions t and 7t are lines, their derivatives are simply their 
slopes, 1, and 7. All of these could also have been found using the 
formula that says the derivative of $t^k$ is $kt^{k-1}$, but it wasn’t really
necessary to get that fancy. To find the derivative of $t^2$, we can use
the formula, which gives 2t. One of the properties of the derivative
is that multiplying a function by a constant multiplies its derivative
by the same constant, so the derivative of $7t^2$ must be (7)(2t) = 14t.
By similar reasoning, the derivatives of $t^3$ and $7t^3$ are $3t^2$ and $21t^2$,
respectively.


Page 37, problem c2: 

They are the same function. A function is a graph that satisfies 
the vertical-line property. Both functions have all the same points 
in their graphs, so the two definitions have defined the same graph, 
which is the same function. 


Page 37, problem c3: 

Let m be the national budget surplus. For a brief period in an 
economic boom during the Clinton administration, the U.S. federal 
government had a budget surplus, so m was positive. Later, the 
economy cooled down and m became negative again — which is its 
normal state in the modern era. At some point in time t, m had to 
change from being positive to being negative, so m(t) = 0. At that 
moment, m was decreasing, so m'(t) < 0.


Page 37, problem d1: 

The addition property of the derivative tells us that we can break 
this down into the sum of the derivatives $(3x^4)'$, $(-2x^2)'$, $(x)'$,
and $(1)'$. The derivative of the final, constant term is zero by
the constant property. Using the power rule and adding, we have
$12x^3 - 4x + 1$.


Page 38, problem e1:

One of the properties of the derivative is that the derivative of a 
sum is the sum of the derivatives, so we can get this by adding up 
the derivatives of $3z^7$, $-4z^2$, and 6. The derivatives of the three
terms are $21z^6$, $-8z$, and 0, so the derivative of the whole thing is
$21z^6 - 8z$.


For the numerical check, let’s use z = 1 and Δz = 0.001. Call
the function f.

$$\frac{df}{dz} = 13$$

$$\frac{\Delta f}{\Delta z} = \frac{5.0131 - 5.0000}{0.001} = 13.1$$


These agree well enough that it’s unlikely that we’ve made an error 
such as a wrong sign or getting the wrong integer for one of the 
coefficients. 
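
The numerical check described in this solution is easy to automate. The sketch below takes the function and its claimed derivative from this solution; the names `f`, `f_prime`, and `ratio` are illustrative.

```python
def f(z):
    # The function from the problem: 3z^7 - 4z^2 + 6.
    return 3 * z ** 7 - 4 * z ** 2 + 6

def f_prime(z):
    # The derivative found symbolically: 21z^6 - 8z.
    return 21 * z ** 6 - 8 * z

z, dz = 1.0, 0.001

# Finite-difference estimate of the slope, using a small but finite change in z.
ratio = (f(z + dz) - f(z)) / dz

print(f_prime(z), ratio)
```

The estimate comes out slightly above 13 because Δz is finite; making Δz smaller brings it closer to the exact derivative.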


Page 38, problem e6: 
The first thing that comes to mind is the function f defined by 
f(x) = 7x. Its graph would be a line with a slope of 7, passing 
through the origin. Any other line with a slope of 7 would work too, 
e.g., 7x + 1 and 7x − 42.
Page 40, problem i1:

This is exactly like problem e1, except that instead of explicit numerical
constants like 3 and −4, this problem involves symbolic
constants a, b, and c. The result is 2at + b. 


Page 42, problem m1: 

When the vertical stretch factor r is a natural number, that means 
that the function rf can be written as f+ f+...+ f, where the 
number of terms in the sum is r. By the addition property of the 
derivative, the derivative of rf is then f’+ f’+...+ f’, which is the 
same as rf’. This is the vertical stretch property. 


Page 43, problem n1: 

If the width and length of the rectangle are t and u, and Rick is 
going to use up all his fencing material, then the perimeter of the 
rectangle, 2t + 2u, equals L, so for a given width, t, the length is 
u = L/2—t. The area is a = tu = t(L/2 —t). The function only 
means anything realistic for 0 < t < L/2, since for values of t outside 
this region either the width or the height of the rectangle would be 
negative. The function a(t) could therefore have a maximum either 
at a place where da/ dt = 0, or at the endpoints of the function’s 
domain. We can eliminate the latter possibility, because the area is 
zero at the endpoints. 


To evaluate the derivative, we first need to reexpress a as a 
polynomial: 


$$a = -t^2 + \frac{Lt}{2}$$
The derivative is
$$\frac{da}{dt} = -2t + \frac{L}{2}$$


Setting this equal to zero, we find t = L/4, as claimed. 


Page 43, problem n2: 

Since polynomials don’t have kinks or endpoints in their graphs, 
the maxima and minima must be points where the derivative is 
zero. Differentiation bumps down all the powers of a polynomial 
by one, so the derivative of a third-order polynomial is a second- 
order polynomial. A second-order polynomial can have at most two 
real roots (values of t for which it equals zero), which are given by 
the quadratic formula. (If the number inside the square root in the 
quadratic formula is zero or negative, there could be less than two 
real roots.) That means a third-order polynomial can have at most 
two maxima or minima. 


Page 44, problem r1: 
The approximation we’re going to use is 

$$\frac{dy}{dx} \approx \frac{\Delta y}{\Delta x}.$$

Since we want an answer valid to three decimal places, it might 
be reasonable to try a $\Delta x$ value such as 0.0001, since that’s a lot 
smaller than $10^{-3}$. We then have: 

$$\frac{\Delta y}{\Delta x} = \frac{1/(1 - 0.0001) - 1/(1 - 0)}{0.0001 - 0} = 1.00010$$

It looks like we’re getting 1 as our answer. To see if the result is 
really valid to three decimal places, we can try making $\Delta x$ smaller, 
and see how much the result changes. With $\Delta x = 10^{-5}$, we get 
1.00001. The change is in the fifth decimal place, so it looks like the 
first three decimal places are correct. 
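
The same experiment is easy to repeat by machine. This is a minimal sketch, assuming the function in question is y = 1/(1 − x) evaluated near x = 0, which is what the numbers in the solution suggest; the helper name `difference_quotient` is made up for this example.

```python
def difference_quotient(f, x, dx):
    """Approximate dy/dx at x by the slope over a small interval dx."""
    return (f(x + dx) - f(x)) / dx

def f(x):
    # Assumed function, inferred from the arithmetic in the solution
    return 1 / (1 - x)

for dx in (1e-4, 1e-5):
    print(dx, difference_quotient(f, 0.0, dx))
```

Shrinking the interval by a factor of ten changes the estimate only far to the right of the third decimal place, which is the criterion used above.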





Page 45, problem s1: 
(a) We have 


yo kt 
(b) Here n = 2, so a relative error of 0.1% in the length will cause 
a 0.2% error in the area. 


Page 45, problem s2: 

Thinking of the rocket’s height as a function of time, we can see 
that the goal is to measure the function at its maximum. The derivative 
is zero at the maximum, so the error incurred due to timing is 
approximately zero. She should not worry about the timing error 
too much. Other factors are likely to be more important, e.g., the 
rocket may not rise exactly vertically above the launchpad. 


Solutions for chapter 2 


Page 69, problem e1: 
Reexpressing $\sqrt[3]{x}$ as $x^{1/3}$, the derivative is $(1/3)x^{-2/3}$. 


Page 69, problem e2: 

(a) Using the chain rule, the derivative of $(x^2 + 1)^{1/2}$ is 
$(1/2)(x^2 + 1)^{-1/2}(2x) = x(x^2 + 1)^{-1/2}$. 

(b) This is the same as a, except that the 1 is replaced with an $a^2$, 
so the answer is $x(x^2 + a^2)^{-1/2}$. The idea would be that a has the 
same units as x. 

(c) This can be rewritten as $(a + x)^{-1/2}$, giving a derivative of 
$(-1/2)(a + x)^{-3/2}$. 

(d) This is similar to c, but we pick up a factor of $-2x$ from the 
chain rule, making the result $x(a - x^2)^{-3/2}$. 


Page 70, problem e4: 

The vertical stretch rule says that stretching a function y(x) vertically 
to form a new function ry(x) multiplies its derivative by r at 
the corresponding points. That is, if r is a constant, then $(ry)' = ry'$. 
To prove this using the product rule, we have 

$$(ry)' = r'y + y'r.$$

But if r is a constant, then $r' = 0$, so the first term is zero, and we 
have the claimed result. 


Page 71, problem i2: 
Let P be the point (1,1), and let Q lie on the graph at x = 1 + dx. 
The slope of the line through P and Q is 

$$\begin{aligned}
\text{slope of line } PQ &= \frac{\Delta y}{\Delta x} \\
&= \frac{(1 + dx)^3 - 1}{(1 + dx) - 1} \\
&= \frac{3\,dx + 3\,dx^2 + dx^3}{dx}.
\end{aligned}$$

Discarding the $dx^2$ and $dx^3$ terms, this becomes 3, which is the same 
as the result we got by doing limits. 


Page 71, problem i3: 

This would be a horrible problem if we had to expand this as a 
polynomial with 101 terms, as in chapter 1! But now we know the 
chain rule, so it’s easy. The derivative is 


$$\left[100(2x + 3)^{99}\right]\left[2\right],$$


where the first factor in brackets is the derivative of the function 
on the outside, and the second one is the derivative of the “inside 
stuff.” Simplifying a little, the answer is $200(2x + 3)^{99}$. 
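
For comparison, the brute-force route that the solution avoids is no trouble for a computer. The following sketch expands $(2x + 3)^{100}$ into its 101 coefficients with the binomial theorem, differentiates term by term, and compares the result with the expansion of $200(2x + 3)^{99}$. The helper names are invented for this example.

```python
from math import comb

def expand_power(a, b, n):
    """Coefficients of (a*x + b)**n, lowest power first, via the binomial theorem."""
    return [comb(n, k) * a**k * b**(n - k) for k in range(n + 1)]

def differentiate(coeffs):
    """Differentiate a polynomial term by term using the power rule."""
    return [k * c for k, c in enumerate(coeffs)][1:]

# Derivative of the 101-term expansion versus the chain-rule answer
brute_force = differentiate(expand_power(2, 3, 100))
chain_rule = [200 * c for c in expand_power(2, 3, 99)]
print(brute_force == chain_rule)
```

Because Python integers are exact, the comparison is an exact coefficient-by-coefficient check rather than a numerical approximation.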


Page 71, problem i4: 
Applying the product rule, we get 
$$100(x + 1)^{99}(x + 2)^{200} + 200(x + 1)^{100}(x + 2)^{199}.$$
(The chain rule was also required, but in a trivial way — for both 
of the factors, the derivative of the “inside stuff” was one.) 


Page 71, problem i5: 
The chain rule gives 


$$\frac{d}{dx}\left((x^2)^2\right)^2 = 2\left((x^2)^2\right)\left(2(x^2)\right)(2x) = 8x^7,$$

which is the same as the result we would have gotten by differentiating 
$x^8$. 


Page 71, problem i6: 
Converting these into Leibniz notation, we find 


$$\frac{df}{dx} = \frac{dg}{dh}$$
and 

$$\frac{df}{dx} = \frac{dg}{dh}\,h.$$

To prove something is not true in general, it suffices to find one 
counterexample. Suppose that g and h are both unitless, and x has 
units of seconds. The value of f is defined by the output of g, so 
f must also be unitless. Since f is unitless, df/dx has units of 
inverse seconds (“per second”). But this doesn’t match the units 
of either of the proposed expressions, because they’re both unitless. 
The correct chain rule, however, works. In the equation 

$$\frac{df}{dx} = \frac{dg}{dh}\,\frac{dh}{dx},$$

the right-hand side consists of a unitless factor multiplied by a factor 
with units of inverse seconds, so its units are inverse seconds, 
matching the left-hand side. 


Page 74, problem p1: 

We can make life a lot easier by observing that the function s(f) 
will be maximized when the expression inside the square root is 
minimized. Also, since f is squared every time it occurs, we can 
change to a variable $x = f^2$, and then once the optimal value of x 
is found we can take its square root in order to find the optimal f. 
The function to be optimized is then 

$$a(x - f_o^2)^2 + bx.$$

Differentiating this and setting the derivative equal to zero, we find 

$$2a(x - f_o^2) + b = 0,$$

which results in $x = f_o^2 - b/2a$, or 

$$f = \sqrt{f_o^2 - b/2a}$$

(choosing the positive root, since f represents a frequency, and 
frequencies are positive by definition). Note that the quantity inside 
the square root involves the square of a frequency, but then we take 
its square root, so the units of the result turn out to be frequency, 
which makes sense. We can see that if b is small, the second term 
is small, and the maximum occurs very nearly at $f_o$. 


There is one subtle issue that was glossed over above, which is 
that the graph on page 74 shows two extrema: a minimum at f = 0 
and a maximum at f > 0. What happened to the f = 0 minimum? 
The issue is that I was a little sloppy with the change of variables. 
Let I stand for the quantity inside the square root in the original 
expression for s. Then by the chain rule, 

$$\frac{ds}{df} = \frac{ds}{dI}\,\frac{dI}{dx}\,\frac{dx}{df}.$$

We looked for the place where dI/dx was zero, but ds/df could also 
be zero if one of the other factors was zero. This is what happens 
at f = 0, where dx/df = 0. 


Page 78, problem t1: 
The graph looks like this: 


Clearly it has a kink in it. No matter how far we zoom in, the 
kink will never look like a line. The function is not differentiable at 
x=0. 


Page 78, problem t2: 

The function f(x) = 1/sin x can be written as a composition f(x) = 
g(h(x)) of the functions g(x) = 1/x and h(x) = sin x. We don’t have 
to recall anything about the sine function, h, except that it looks 
like a sine wave, so that it’s clearly continuous and differentiable 
everywhere. The function g, on the other hand, is discontinuous 
at 0, so it will be discontinuous at any x such that sin x = 0, and 
f will also be discontinuous in these places. The relevant values 
of x are {…, −2π, −π, 0, π, 2π, …}. Since f is discontinuous at 
these points, it is also nondifferentiable there, because discontinuity 
implies nondifferentiability. 


Page 78, problem t3: 
A cusp will occur if both branches are vertical at x = 0, i.e., if $f'$ 
blows up there. 


For positive values of x, the definition of f is the same as $x^p$, so 
by the power rule $f' = px^{p-1}$. For negative x, the horizontal flip 
property of the derivative (p. 16) tells us that $f'$ equals minus the 
value of the derivative at the corresponding point on the right. 


For p < 1, the derivative blows up, and f has a cusp. 


If f is to be differentiable at x = 0, then it can’t have a kink. By 
the symmetry property described above, this requires that $f'(0) = 0$. 
This occurs if p > 1. The function is nondifferentiable when p ≤ 1. 


Page 80, problem y1: 
We can derive a three-factor product rule by grouping the three 
factors into two factors, and then applying the two-factor rule. 

$$\begin{aligned}
(fgh)' &= [(fg)h]' \\
&= (fg)'h + h'fg \\
&= (f'g + g'f)h + h'fg \\
&= f'gh + g'hf + h'fg
\end{aligned}$$


Solutions for chapter 3 


Page 92, problem a1: 
The first derivative is $12z^3 - 8z$. Differentiating a second time, we 
get $36z^2 - 8$. 


Page 92, problem c1: 

The first derivative is 3t? + 2t, and the second is 6t +2. Setting this 
equal to zero and solving for t, we find t = —1/3. Looking at the 
graph, it does look like the concavity is down for t < —1/3, and up 
for t > —1/3. 
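
The change in concavity can also be checked numerically without looking at the graph. The sketch below assumes the function is $t^3 + t^2$, i.e., an antiderivative of the first derivative quoted above with the constant term set to zero (the constant does not affect concavity), and estimates the second derivative from three nearby points.

```python
def y(t):
    # Assumed form: antiderivative of 3t^2 + 2t with the constant set to zero
    return t**3 + t**2

def second_difference(f, t, h=1e-3):
    """Estimate the second derivative of f at t from three nearby points."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

for t in (-1.0, -1 / 3, 0.0):
    print(t, second_difference(y, t))
```

The estimate is negative to the left of t = −1/3, essentially zero at that point, and positive to the right, matching the sign of 6t + 2.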







Page 92, problem c2: 
Since f, g, and s are smooth and defined everywhere, any extrema 
they possess occur at places where their derivatives are zero. The 
converse is not necessarily true, however; a place where the derivative 
is zero could be a point of inflection. The derivative is additive, 
so if both f and g have zero derivatives at a certain point, s does as 
well. Therefore in most cases, if f and g both have an extremum at 
a point, so will s. However, it could happen that this is only a point 
of inflection for s, so in general, we can’t conclude anything about 
the extrema of s simply from knowing where the extrema of f and 
g occur. 


Going the other direction, we certainly can’t infer anything about 
extrema of f and g from knowledge of s alone. For example, if 
$s(x) = x^2$, with a minimum at x = 0, that tells us very little about 
f and g. We could have, for example, $f(x) = (x - 1)^2/2 - 2$ and 
$g(x) = (x + 1)^2/2 + 1$, neither of which has an extremum at x = 0. 


Solutions for chapter 4 


Page 121, problem a1: 

x                 $\sqrt{x+1} - \sqrt{x-1}$
1000              0.032
1,000,000         0.0010
1,000,000,000     0.000032


The result is getting smaller and smaller, so it seems reasonable 
to guess that the limit is zero. 
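
A table like this is quick to regenerate. The sketch below assumes the expression is $\sqrt{x+1} - \sqrt{x-1}$, which reproduces the first two entries, and also evaluates the algebraically equivalent form $2/(\sqrt{x+1} + \sqrt{x-1})$, obtained by multiplying and dividing by the conjugate. The second form avoids subtracting two nearly equal numbers and makes it clear why the result shrinks toward zero.

```python
from math import sqrt

def f(x):
    # Assumed expression, inferred from the tabulated values
    return sqrt(x + 1) - sqrt(x - 1)

def f_conjugate(x):
    """Same quantity rewritten as 2 / (sqrt(x+1) + sqrt(x-1))."""
    return 2 / (sqrt(x + 1) + sqrt(x - 1))

for x in (1e3, 1e6, 1e9):
    print(x, f(x), f_conjugate(x))
```

In the rewritten form the numerator is fixed while the denominator grows without bound, which supports the guess that the limit is zero.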


Page 121, problem a2: 

If $R_1$ is finite and $R_2$ approaches infinity, then $1/R_2$ approaches 
zero, $1/R_1 + 1/R_2$ approaches $1/R_1$, and the combined resistance 
R approaches $R_1$. Physically, the second pipe is blocked or 
too thin to carry any significant flow, so it’s as though it weren’t 
present. 


If $R_1$ is finite and $R_2$ gets very small, then $1/R_2$ gets very big, 
$1/R_1 + 1/R_2$ is dominated by the second term, and the result is 
basically the same as $R_2$. It’s so easy for water to flow through $R_2$ 
that $R_1$ might as well not be present. In the context of electrical 
circuits rather than water pipes, this is known as a short circuit. 


Page 121, problem c1: 

The shape of the graph can be found by considering four cases: large 
negative x, small negative x, small positive x, and large positive x. 
In these four cases, the function is respectively close to 1, large, 
small, and close to 1. 

[Figure: graph of the function.]


The four limits correspond to the four cases described above. 


Page 123, problem c8: 





For x approaching ±∞, the $x^2$ term dominates, and the function 
approaches zero. Therefore the function has a horizontal asymptote 
at zero. 


Each root of the polynomial in the denominator will correspond 
to a vertical asymptote. These roots can be determined from the 
quadratic formula, which contains the square root of $b^2 - 4ac$, called 
the discriminant. If the discriminant is greater than zero, then there 
will be two asymptotes, corresponding to the positive and negative 
roots of the discriminant. If the discriminant is zero, then there will 
be only one real root and one vertical asymptote. If the discriminant 
is negative, then there are no real roots and no vertical asymptotes. 


Page 123, problem c9: 

It has a vertical asymptote where the denominator goes to zero, at 
x = −1. It has horizontal asymptotes at y = 1, since in the limits 
as x approaches ±∞, the numerator and denominator are dominated 
by their highest-power terms, and the constant terms become unimportant. 


Page 123, problem c10: 
The function 

$$f(x) = \left(\frac{x^2 + 1}{x^2 + 2} - \frac{x^2 + 3}{x^2 + 4}\right)^{-1}$$

is not given in the form of a rational function, and the most straightforward 
thing to do here would be simply to change it into that form. 
Before we do that, however, we could look for values of x at which 
the quantity inside the parentheses would go to zero; these would 
be the vertical asymptotes. Setting the denominator equal to zero 
gives $(x^2 + 1)(x^2 + 4) = (x^2 + 2)(x^2 + 3)$, which simplifies to 4 = 6. 
There are no solutions, and therefore the function has no vertical 
asymptotes. 














Going ahead and recasting it as a rational function, we first need 
to put the two terms over a common denominator. This gives 








$$f(x) = \left(\frac{(x^2 + 1)(x^2 + 4) - (x^2 + 2)(x^2 + 3)}{(x^2 + 2)(x^2 + 4)}\right)^{-1}$$


which simplifies to 

$$f(x) = \left(\frac{-2}{(x^2 + 2)(x^2 + 4)}\right)^{-1} = -\frac{1}{2}(x^2 + 2)(x^2 + 4).$$


We now see that the exotic-looking function was in fact just a poly- 
nomial in disguise. Polynomials don’t have horizontal or vertical 
asymptotes. 
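
The claim that this function is a polynomial in disguise can be checked with exact rational arithmetic. This is a sketch under the assumption that the original function is the inverse of the difference $(x^2+1)/(x^2+2) - (x^2+3)/(x^2+4)$, as the algebra above indicates; it compares that expression with $-\frac{1}{2}(x^2+2)(x^2+4)$ at a handful of integer points.

```python
from fractions import Fraction

def f(x):
    """The function in its original, exotic-looking form (assumed as described above)."""
    x2 = Fraction(x) ** 2
    return 1 / ((x2 + 1) / (x2 + 2) - (x2 + 3) / (x2 + 4))

def polynomial(x):
    """The simplified form, -(1/2)(x^2 + 2)(x^2 + 4)."""
    x2 = Fraction(x) ** 2
    return -(x2 + 2) * (x2 + 4) / 2

print(all(f(x) == polynomial(x) for x in range(-5, 6)))
```

Using `Fraction` rather than floating point means the two forms are compared exactly, so agreement at these points is not an artifact of rounding.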


Page 123, problem e1: 

Clearly f will be a non-decreasing function and will asymptotically 
approach 1 as x approaches infinity. We can also say something 
about the value of f’(0). Bounty hunting is a nasty, dirty, dangerous 
business that requires a significant up-front investment. Therefore 
we don’t expect any bounty hunters to become active unless x is 
high enough to give them some expectation of making a profit, and 
we expect both f(0) = 0 and f’(0) = 0, and the function should be 
essentially zero until it starts to rise at some finite value of x. 

[Figure: sketch of f(x), essentially zero at small x and rising toward a horizontal asymptote at 1.]





Page 124, problem g1: 


[Figure: graphs with t on the horizontal axis.]





Page 124, problem k1: 

If $f''$ is continuous and sometimes positive and sometimes negative, 
then by the intermediate value theorem there is a point where 
$f''(x) = 0$. (This is the part of the argument that fails for a function 
on the rationals.) Furthermore, we must have some such x at which 
$f''$ changes sign, and this is by definition a point of inflection. 


Solutions for chapter 5 


Page 138, problem a1: 

A point on the unit circle has coordinates (x, y) = (cos θ, sin θ), 
where θ is the angle measured counterclockwise from the x axis. If 
we want both sine and cosine to be negative, then we need a point on 
the unit circle that lies in the third quadrant, excluding the points 
that coincide with the axes. That means θ ∈ (π, 3π/2). 


Page 139, problem c1: 
By the chain rule, the result is 2/(2t + 1). 


Page 139, problem c2: 

We need to put together three different ideas here: (1) When a 
function to be differentiated is multiplied by a constant, the constant 
just comes along for the ride. (2) The derivative of the sine is the 
cosine. (3) We need to use the chain rule. The result is ab cos(bx+c). 


Page 139, problem c3: 

The derivative of $e^{7x}$ is $e^{7x} \cdot 7$, where the first factor is the derivative 
of the outside stuff (the derivative of a base-e exponential is just 
the same thing), and the second factor is the derivative of the inside 
stuff. This would normally be written as $7e^{7x}$. 


The derivative of the second function is $e^{e^x} e^x$, with the second 
exponential factor coming from the chain rule. 


Page 139, problem c4: 
To find a maximum, we take the derivative and set it equal to zero. 
The whole factor of $2v^2/g$ in front is just one big constant, so it 
comes along for the ride. To differentiate the factor of sin θ cos θ, 
we need to use the product rule, plus the fact that the derivative of 
sin is cos, and the derivative of cos is −sin. 

$$\begin{aligned}
0 &= \frac{2v^2}{g}\left(\cos\theta\cos\theta + \sin\theta(-\sin\theta)\right) \\
0 &= \cos^2\theta - \sin^2\theta \\
\cos\theta &= \pm\sin\theta
\end{aligned}$$

We’re interested in angles between 0 and 90 degrees, for which both 
the sine and the cosine are positive, so 

$$\begin{aligned}
\cos\theta &= \sin\theta \\
\tan\theta &= 1 \\
\theta &= 45^\circ.
\end{aligned}$$

To check that this is really a maximum, not a minimum or an inflection 
point, we could resort to the second derivative test, but we 
know the graph of R(θ) is zero at θ = 0 and θ = 90°, and positive 
in between, so this must be a maximum. 
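
A direct scan over launch angles gives the same answer. In this sketch the range is taken to be $(2v^2/g)\sin\theta\cos\theta$, as in the solution; the speed v = 10 and g = 9.8 are arbitrary illustrative values, and the result does not depend on them.

```python
from math import sin, cos, radians

def projectile_range(theta_deg, v=10.0, g=9.8):
    """Range for launch angle theta_deg; v and g are illustrative values only."""
    theta = radians(theta_deg)
    return (2 * v**2 / g) * sin(theta) * cos(theta)

# Try every whole-degree angle from 0 to 90 and keep the best one
best_angle = max(range(0, 91), key=projectile_range)
print(best_angle)
```

The scan also shows the range vanishing at both ends of the interval, which is the observation used above to rule out a minimum.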


Page 139, problem c5: 
Since I’ve advocated not memorizing the quotient rule, I’ll do this 
one from first principles, using the product rule. 

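$$\begin{aligned}
\frac{d}{d\theta}\tan\theta &= \frac{d}{d\theta}\left(\frac{\sin\theta}{\cos\theta}\right) \\
&= \frac{d}{d\theta}\left[\sin\theta\,(\cos\theta)^{-1}\right] \\
&= \cos\theta\,(\cos\theta)^{-1} + (\sin\theta)(-1)(\cos\theta)^{-2}(-\sin\theta) \\
&= 1 + \tan^2\theta
\end{aligned}$$

(Using a trig identity, this can also be rewritten as $\sec^2\theta$.) 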
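
As a sanity check on the algebra, the slope of the tangent function can be estimated numerically and compared with $1 + \tan^2\theta$. The helper `central_difference` below is an illustrative name, and the sample angles are arbitrary.

```python
from math import tan

def central_difference(f, x, h=1e-6):
    """Estimate f'(x) from the values of f slightly to either side of x."""
    return (f(x + h) - f(x - h)) / (2 * h)

for theta in (0.0, 0.5, 1.0):  # sample angles in radians
    print(theta, central_difference(tan, theta), 1 + tan(theta) ** 2)
```

The two columns agree to many decimal places at each sample angle, as expected if the product-rule calculation is right.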


Page 139, problem c6: 

There are no kinks, endpoints, etc., so extrema will occur only in 
places where the derivative is zero. Applying the chain rule, we find 
the derivative to be cos(sin(sin x)) cos(sin x) cos x. This will be zero 
if any of the three factors is zero. We have cos u = 0 only when 
|u| ≥ π/2, and π/2 is greater than 1, so it’s not possible for either 
of the first two factors to equal zero. The derivative will therefore 
equal zero if and only if cos x = 0, which happens in the same places 
where the derivative of sin x is zero, at x = π/2 + πn, where n is an 
integer. 


Page 139, problem c7: 

Taking the derivative and setting it equal to zero, we have $(e^x - e^{-x})/2 = 0$, 
so $e^x = e^{-x}$, which occurs only at x = 0. The second derivative is 
$(e^x + e^{-x})/2$ (the same as the original function), which is positive 
for all x, so the function is everywhere concave up, and this is a 
minimum. 


Page 141, problem f1: 

Let us first pause to mourn the loss of this perfectly good bottle of 
beer, and to vow that such a thing must never be allowed to happen 
again. 

(a) Since T has units of degrees, both terms on the right-hand side 
must also have units of degrees. The first term on the right is a, so 
a has units of degrees. The second term consists of b multiplied by 
an exponential. The exponential is unitless, so b must have units of 
degrees. The input to the exponential must be unitless as well, so c 
must have units of inverse seconds ($\text{s}^{-1}$). 

(b) $dT/dt = -bc\,e^{-ct}$ 

On the left side, the units are what is implied by the original 
interpretation of the Leibniz notation: we have a small change in 
temperature divided by a small change in time, so the units are degrees 
per second (°/s). On the right, the units come from the factor 
bc, since the exponential is unitless. The units of bc are degrees 
multiplied by inverse seconds, (°)($\text{s}^{-1}$), and this matches what we 
had on the left-hand side. 

(c) In this limit, the temperature 
approaches a, and the derivative approaches zero. It makes sense 
that the derivative goes to zero, since eventually the beer will be in 
thermal equilibrium with the air. 

(d) Physically, a is the temperature of the air, b is the difference in 
temperature at t = 0 between the air and the beer, and c measures 
how good the thermal contact is between the air and the beer — 
e.g., if the beer is in a styrofoam container, c will be small. 


Solutions for chapter 6 


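Page 153, problem a1: 
All five of these can be done using l’Hôpital’s rule: 

$$\begin{aligned}
\lim_{s \to 1} \frac{s^3 - 1}{s - 1} &= \lim \frac{3s^2}{1} = 3 \\
\lim_{\theta \to 0} \frac{1 - \cos\theta}{\theta^2} &= \lim \frac{\sin\theta}{2\theta} = \lim \frac{\cos\theta}{2} = \frac{1}{2} \\
\lim_{x \to \infty} \frac{5x^2 - 2x}{x} &= \lim \frac{10x - 2}{1} = \infty \\
\lim_{n \to \infty} \frac{n(n + 1)}{(n + 2)(n + 3)} &= \lim \frac{n^2 + \ldots}{n^2 + \ldots} = \lim \frac{2n + \ldots}{2n + \ldots} = \lim \frac{2}{2} = 1 \\
\lim_{x \to \infty} \frac{ax^2 + bx + c}{dx^2 + ex + f} &= \lim \frac{2ax + \ldots}{2dx + \ldots} = \lim \frac{2a}{2d} = \frac{a}{d}
\end{aligned}$$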

In examples 2, 4, and 5, we differentiate more than once in order 
to get an expression that can be evaluated by substitution. In 4 
and 5, … represents terms that we anticipate will go away after 
the second differentiation. Most people probably would not bother 
with l’Hôpital’s rule for 3, 4, or 5, being content merely to observe 
the behavior of the highest-order term, which makes the limiting 
behavior obvious. Examples 3, 4, and 5 can also be done rigorously 
without l’Hôpital’s rule, by algebraic manipulation; we divide on the 
top and bottom by the highest power of the variable, giving an 
expression that is no longer an indeterminate form ∞/∞. 


Page 153, problem a2: 

Both numerator and denominator go to zero, so we can apply l’Hôpital’s 
rule. Differentiating top and bottom gives $(\cos x - x\sin x)/(-\ln 2 \cdot 2^x)$, 
which equals $-1/\ln 2$ at x = 0. To check this numerically, we 
plug $x = 10^{-3}$ into the original expression. The result is −1.44219, 
which is very close to $-1/\ln 2 = -1.44269\ldots$. 
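
The numerical check is a one-liner to reproduce. The original expression is not restated in this solution, so the sketch below assumes it is $x\cos x/(1 - 2^x)$, which is the form whose numerator and denominator have the derivatives quoted above.

```python
from math import cos, log

def original_expression(x):
    # Assumed form, inferred from the derivatives quoted in the solution
    return x * cos(x) / (1 - 2**x)

print(original_expression(1e-3), -1 / log(2))
```

Trying smaller values of x brings the first number closer to the second, consistent with the limit found from l’Hôpital’s rule.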


Page 153, problem a3: 
L’Hôpital’s rule only works when both the numerator and the denominator 
go to zero. 


Page 153, problem a4: 
Applying l’Hôpital’s rule once gives 

$$\lim_{u \to 0} \frac{2u}{e^u - e^{-u}},$$

which is still an indeterminate form. Applying the rule a second 
time, we get 

$$\lim_{u \to 0} \frac{2}{e^u + e^{-u}} = 1.$$

As a numerical check, plugging u = 0.01 into the original expression 
results in 0.9999917. 
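
The quoted check value can be reproduced as follows. The original expression is not restated here, so this sketch assumes it is $u^2/(e^u + e^{-u} - 2)$, the form consistent with the derivatives shown in the two applications of the rule.

```python
from math import exp

def limit_expression(u):
    # Assumed form, inferred from the derivatives shown in the solution
    return u**2 / (exp(u) + exp(-u) - 2)

print(limit_expression(0.01))
```

Under that assumption the value at u = 0.01 agrees with the figure quoted in the solution to the digits given.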


Page 153, problem a5: 
L’Hôpital’s rule gives cos t/1 → −1. Plugging in t = 3.1 gives 
−0.9997. 


Solutions for chapter 7 


Page 169, problem e1: 

We have the same power law for differentials as for derivatives, so 
the result is $52B^{51}\,dB$. Note that the answer is wrong without the 
dB. If we think of differentials as “a little bit of…,” then $d(B^{52})$ 
means a tiny change in $B^{52}$. It can’t equal $52B^{51}$, because $52B^{51}$ is 
not typically going to be tiny. 


Page 169, problem e2: 

As with derivatives, a constant factor just “comes along for the ride,” 
so d(2000BC) = 2000d(BC). We have the same product rule for 
differentials as for derivatives, so the result is 2000(B dC + C dB). 


Page 169, problem e3: 

We have the same chain rule for differentials as for derivatives. If k 
had been a function of some other variable t, and we’d been taking 
the derivative of sin k with respect to t, then we would have had 
cos k dk/dt. For the differential we have simply cos k dk. 


Page 169, problem e4: 
Applying the sum rule and then the product rule, we have 
p db + b dp + dj. 


Page 170, problem g1: 
Squaring both sides clears the square root. 

$$y^2 = x^2 + 1$$

Implicit differentiation gives the following. 

$$\begin{aligned}
2y\,dy &= 2x\,dx \\
\frac{dy}{dx} &= \frac{x}{y} \\
&= \frac{x}{\sqrt{x^2 + 1}}
\end{aligned}$$


Page 170, problem i1: 


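$$\begin{aligned}
e^{x+y}\,dx + xe^{x+y}(dx + dy) + dy &= 0 \\
\left(e^{x+y} + xe^{x+y}\right)dx + \left(xe^{x+y} + 1\right)dy &= 0 \\
\frac{dy}{dx} &= -\frac{e^{x+y} + xe^{x+y}}{1 + xe^{x+y}}
\end{aligned}$$

Plugging in x = 0 and y = 0 gives dy/dx = −1. 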
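
The slope at the origin can be cross-checked without any calculus by solving for y numerically at two nearby values of x. This sketch assumes the curve is $xe^{x+y} + y = 0$, a form whose differential matches the first line above and which passes through (0, 0); the bisection helper and its bracketing interval are choices made for this example.

```python
from math import exp

def F(x, y):
    # Assumed form of the curve, F(x, y) = 0, inferred from the differentials above
    return x * exp(x + y) + y

def solve_for_y(x, lo=-1.0, hi=1.0, iterations=60):
    """Find y with F(x, y) = 0 by bisection; F increases with y when |x| is small."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if F(x, mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def slope_at_origin(h=1e-4):
    """Estimate dy/dx at (0, 0) from two nearby points on the curve."""
    return (solve_for_y(h) - solve_for_y(-h)) / (2 * h)

print(slope_at_origin())
```

Under the stated assumption the estimate comes out at −1 to within rounding, in agreement with the result of implicit differentiation.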





Solutions for chapter 8 


Page 197, problem a1: 
The given equation 

$$P_2 - P_1 = \rho g \Delta y$$

involves multiplication of a number ρ by a number gΔy. If ρ is 
not constant, then the proper way to generalize multiplication is 
through an integral. 

$$P_2 - P_1 = \int_{y_1}^{y_2} \rho g \, dy$$


Page 197, problem a2: 
The two options proposed are: 


dy o-r 
es era Tf) 


LE 
pv= [ f(ije-"* dt 
0 


The units of the present value should be dollars. 


The first proposed equation is nonsense based on units, because 
f has units of dollars/year, and its time derivative would therefore 
have units of dollars/year$^2$, not dollars. 


The units of the second equation do make sense. The Leibniz 
notation for the integral is designed so that if you analyze the units 
and treat the integral sign as a sum, the units are what they look 
like they are. On the right-hand side, the units are (dollars/year) × 
years = dollars, which matches the units on the left-hand side. This 
doesn’t prove that this equation is right, but it doesn’t prove it 
wrong, either. 


Page 197, problem a3: 
The proposed relationships are: 


A derivative represents a rate of change, while an integral repre- 
sents the accumulation of change. Based on these concepts, the first 
equation makes sense: the current tells us how fast our accumulated 
bill is adding up. The second one doesn’t make sense conceptually. 


1 
ie 
0 


This one is wrong because it’s written ungrammatically. It’s wrong 
without the dx, for the reasons explained on p. 180. 


1 
[ova 
0 
1 
[ova 
0 


This one is also correct. It doesn’t matter that a different letter is 
used. The x or u is just a dummy variable. 


Page 198, problem c2: 
(a) 


This one is correct. 


(b) The correct way to notate this is $\int (x^2 + 1)\,dx$, so that the 
differential dx is being multiplied by the whole expression. The 
notation $\int x^2 + 1\,dx$ makes it look like the dx is only multiplying 
the 1. 


Page 198, problem e1: 
We know that the derivative of $e^x$ is $e^x$. Adding a constant doesn’t 
matter, so two more possibilities are $e^x + 7$ and $e^x + 13$. 


Page 198, problem e2: 


$$\int x\,dx = \frac{1}{2}x^2 + c$$

Differentiating the right-hand side gives $\frac{1}{2}(2x) = x$, which is correct. 
(The derivative of the constant term is zero.) 

$$\int x^4\,dx = 4x^5 + c$$

Differentiating the right-hand side would give $20x^4$, which is wrong. 
The coefficient on the right should be 1/5, not 4. 

$$\int e^x\,dx = e^x + c$$

Differentiation gives $e^x$, which is right. 

$$\int e^{2x}\,dx = e^{2x} + c$$

Differentiation gives $2e^{2x}$, where the factor of 2 in front comes from 
the chain rule. The integral is wrong as written. It should have a 
factor of 1/2 in front. 


$$\int x^{-1}\,dx = x^0 + c$$

This is wrong. Raising something to the power 0 simply gives 1, so 
the right-hand side is 1 + c, which is a constant. If we differentiate 
it, we get zero, not $x^{-1}$. As in example 7, p. 185, the correct integral 
is ln x + c. 


Page 200, problem i1: 
First we put the integrand into the more familiar and convenient 
form $cx^p$, whose integral is $(c/(p + 1))x^{p+1}$: 

$$\sqrt{Bx\sqrt{x}} = B^{1/2}x^{3/4}$$

Applying the general rule, the result is $(4/7)B^{1/2}x^{7/4}$. 


Page 201, problem n1: 

(a) As described in the instructions above the problem, force has 
units of newtons (N). Since distance is measured in meters (m), the 
constant k must have units of N/m. 


(b) 


$$W = \int_0^b kx\,dx = \left.\frac{1}{2}kx^2\right|_0^b = \frac{1}{2}kb^2$$


(c) As described in the instructions, work has units of N·m, so we 
need to check that the expression $(1/2)kb^2$ also has these units. The 
1/2 is unitless. The constant k has units of N/m, and multiplying 
these units by meters squared does give N·m. 
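
The integral in part (b) can be illustrated by adding up the work done over many small steps, which is the idea of accumulation behind the integral sign. The values k = 3 N/m and b = 2 m below are made up for the example.

```python
def work_by_summation(k, b, n=100000):
    """Add up force times distance over n small steps from x = 0 to x = b."""
    dx = b / n
    # Evaluate the force k*x at the midpoint of each step
    return sum(k * (i + 0.5) * dx * dx for i in range(n))

k, b = 3.0, 2.0  # hypothetical spring constant (N/m) and stretch (m)
print(work_by_summation(k, b), 0.5 * k * b**2)
```

Because the force is linear in x, evaluating it at the midpoint of each step makes the sum agree with $(1/2)kb^2$ apart from rounding.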


Solutions for chapter 9 